perm filename COMSCH.MSG[SCH,LSP]3 blob
sn#787322 filedate 1985-02-21 generic text, type C, neo UTF8
COMMENT ⊗ VALID 00001 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 ENDMK
C⊗;
∂29-Nov-84 1009 RPG Comments on the Preliminary Report (2 pages)
∂28-Nov-84 0331 @MIT-MC:dyb%unc.csnet@csnet-relay.arpa Comments on the Preliminary Report (2 pages)
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 28 Nov 84 03:31:05 PST
Received: from unc by csnet-relay.csnet id a021839; 28 Nov 84 6:30 EST
Received: by unc (4.12/4.7) id AA17493; Tue, 27 Nov 84 23:00:58 est
Date: Tue, 27 Nov 84 23:00:58 est
From: Kent Dybvig <dyb%unc.csnet@csnet-relay.arpa>
Message-Id: <8411280400.AA17493@unc>
To: scheme@mit-mc.ARPA
Subject: Comments on the Preliminary Report (2 pages)
Cc: willc%indiana.csnet@csnet-relay.arpa
Comments on Essential Scheme
These comments are arranged in four categories: lambda syntax, other
special forms, functions, and lexical matters. I feel quite strongly
about the issues I outline here. I have other minor gripes (don't we
all) that I didn't bother to include here, for fear they would water
down the substantive issues.
Lambda Syntax
For several years my Scheme system has had a syntax for lambda
expressions that allows naming of the closure within the body of the
closure for enhanced readability and efficiency. The syntax is:
(lambda name formals . body)
where name is a symbol bound lexically to the closure within body,
formals is a list of formal parameters, and body is a set of zero or
more forms to execute. Name may be omitted and there is no ambiguity
since formals is always a list.
This feature, which I have called "named lambda," allows terse,
referentially transparent recursive function definitions. It is similar
to the optional "rec," special form, but is shorter and perhaps more
easily optimized than rec. It is quite a bit shorter than using
"letrec." For example, the boring factorial function looks like:
(lambda fact (x) (if (= x 0) 1 (* x (fact (- x 1))))),
a totally self contained definition.
Named lambda generalizes to named let, something which seems to have
appeared elsewhere independently. It looks like:
(let name ((x1 v1) (x2 v2) ...) . body),
where name may again be omitted. It translates into:
((lambda name (x1 x2 ...) . body) v1 v2 ...),
and replaces the old iterate expression.
Unfortunately, the partial destructuring provided by the adopted lambda
syntax conflicts with the named lambda syntax, introducing an ambiguity
in some cases, as in:
(lambda f (g x) ...).
Is this a named lambda with two formals or an unnamed lambda with a
single &rest formal?
The partial destructuring syntax is merely a different syntax for the
&rest formal parameter adopted by Common Lisp. The only benefit to be
gained by not using the Common Lisp syntax is that some implementations
might generalize to full destructuring. I think it more likely that we
would want to generalize to optional arguments. Why not just use part
of the syntax adopted by Common Lisp? For now, we could make the &rest
syntax required and the &optional syntax optional.
If everyone dislikes the &rest syntax then I think we ought to resolve
the ambiguity with named lambda by requiring the name to be present.
This would improve code readability and facilitate optimization
regardless of the system the code is executed on.
Other Special Forms
1. The "if" special form should require an else part. This allows the
compiler/interpreter to make a sanity check for the user which might
save some grief, and seems a little cleaner. The macros "when" and
"unless" are the appropriate way to express the intended meaning.
2. Block is a much better name for the statement grouping special form
than "begin". Do we really need an artifact of Algol syntax in our
language? After all, it is a block of code, not a begin of code.
(If someone is worried about clashing with a process blocking function
they should name their function "block-process.")
3. Case expressions should allow single keys without putting them in a
list. The user rarely wants the single key to be a list (eq semantics
and all), so there really isn't any ambiguity. Since single keys are
probably the most common, case expressions will be less bulky.
For example,
(case x ((a) ...) ((b) ...) ((c d) ...) (else ...))
would simply be
(case x (a ...) (b ...) ((c d) ...) (else ...)).
Functions
1. Call/cc should be an alternative to call-with-current-continuation.
How are we ever going to get people to feel comfortable with something
so imposing that it requires 30 keystrokes to type in and takes up half
a line?
2. Transcendental functions should be required only in implementations
with floating point numbers, not just any implementation with numbers
other than integers. In particular, an implementation with rational or
interval arithmetic should not be bound to supporting transcendental
functions. (This was probably just a misstatement in Will's note.)
3. The length function should be generic, returning a value for all
reasonable arguments. There can still be individual functions list-
length, string-length, vector-length, and so on.
Lexical Matters
Scheme systems should be case sensitive. Let's not forget that symbols
have other uses than as Scheme identifiers. How can we implement
prolog-like variables in Scheme without being able to differentiate
between upper and lower case letters within the symbols? Does anyone
really still use an upper-case only terminal? Assuming case
sensitivity, all standard Scheme functions and special forms should be
written with entirely lower-case letters.
∂30-Nov-84 1658 RPG Re: Kent Dybvig's Comments on Scheme
∂30-Nov-84 1545 @MIT-MC:bartley%ti-csl.csnet@csnet-relay.arpa Re: Kent Dybvig's Comments on Scheme
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 30 Nov 84 15:44:40 PST
Received: from ti-csl by csnet-relay.csnet id a008334; 30 Nov 84 18:12 EST
Date: 30 Nov 1984 1147-CST
From: David Bartley <Bartley%ti-csl.csnet@csnet-relay.arpa>
Subject: Re: Kent Dybvig's Comments on Scheme
To: Scheme@mit-mc.ARPA
cc: Bartley%ti-csl.csnet@csnet-relay.arpa
Re: Kent Dybvig's comments on the preliminary report on Scheme
I'd like to briefly add my "votes" on Kent's suggestions and bring up
some additional complexities in the issues he raised.
Lambda Syntax
I prefer (REC name (LAMBDA args . body)) first, then MIT-style
NAMED-LAMBDA second. (I support both.) Will's discussion summarizes
my reasons quite well. We should NOT require all lambdas to be named.
IF without else part
I would not mind requiring WHEN in this case, making (IF a b)
syntactically incorrect. This allows better compile-time checking. I
use WHEN myself to make this distinction and find it useful. BTW, I do
NOT find UNLESS useful, since (WHEN (NOT ..)..) usually reads better.
BLOCK vs BEGIN vs LET()
As Will points out, the real question is whether the construct we have
in mind is a compound expression or an Algol-like block containing both
declarations and expressions to be evaluated. I prefer BEGIN for the
former and LET for the latter.
The problem is that Will suggests (paraphrasing someone else) that
(BEGIN ...) might be equivalent to (LET () ...). I have conflicting
thoughts on this. Pro: if they are the same, then I would vote to
remove BEGIN from the (essential) language as hopelessly redundant with
LET. Con: early papers by Steele and Sussman treat (BEGIN a b) as a
macro for ((LAMBDA (ignored-id) b) a), which is equivalent to
(LET ((ignored-id a)) b). Thus, BEGIN already is a scope-extending
operation.
This has the following practical consequence. What is the scope of FOO
in the following?
(define (f ...)
(define g ...)
(begin
(define foo ...))
body)
I understand that MIT Scheme "promotes" both G and FOO directly under
F. This hack is needed for macros that expand into multiple DEFINEs
and thus must be "wrapped" in something to be spliced in correctly.
If we define BEGIN in terms of LET, then the scope of FOO will have to
be confined to the scope of the BEGIN.
What should we do? I vote that BEGIN indicate a compound expression,
contrary to early Steele&Sussman, and that LET() be used for
Algol-style blocks.
CASE expressions with single keys
We will accept single keys in CASE expressions. If ELSE is to be a
single key in the last clause, however, it will have to be enclosed in
parentheses to avoid being interpreted as "otherwise".
CALL/CC
"CALL-WITH-CURRENT-CONTINUATION" is nonsense to the unititiated, too.
I will support both, but feel like it's silly to have the long name.
Must this be a procedure (e.g. a potential funarg) or can we treat it
as a special form?
Transcendental functions
What's the problem? Just call them "optional" as a group. What we've
been arguing is whether I have "true Scheme" if I have some kind of
representation for "real numbers" but don't support a certain set of
functions. This is irrational!! (sorry)
Generic LENGTH
I strongly abhor Common LISP-style genericity. As Will says, "generic"
operations on numbers with multiple REPRESENTATIONS is one thing, but
trying to be generic across different kinds of things is something
else.
Case-sensitive symbols
I feel very strongly that symbols, as processed by the reader, should
NOT be case sensitive and that (EQ? 'a 'A) be true. STRING->SYMBOL
should preserve case, however, and (EQ? '|a| '|A|) should be false.
-- David Bartley (Bartley @ TI-CSL)
-------
∂01-Dec-84 1049 RPG Kent Dybvig's Comments on Scheme
∂30-Nov-84 2036 @MIT-MC:JINX@MIT-OZ Kent Dybvig's Comments on Scheme
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 30 Nov 84 20:35:53 PST
Date: 30 Nov 1984 23:35 EST (Fri)
Message-ID: <JINX.12067849164.BABYL@MIT-OZ>
From: Bill Rozas <JINX%MIT-OZ@MIT-MC.ARPA>
To: David Bartley <Bartley%ti-csl.csnet@CSNET-RELAY.ARPA>
Cc: Scheme@MIT-MC.ARPA
Subject: Kent Dybvig's Comments on Scheme
In-reply-to: Msg of 30 Nov 1984 12:47-EST from David Bartley <Bartley%ti-csl.csnet at csnet-relay.arpa>
BEGIN should certainly not add another contour. BEGIN is
equivalent to LET() if environments are not first class, but they are
in MIT Scheme. An incremental definition to the environment where the
subexpressions of the BEGIN expression are evaluated would have
different effects depending on whether a new contour was created or
not. I agree that BEGIN (for lack of a better name) should indicate a
compound expression and LET should be used for blocks.
Note that this is not contrary to all early Scheme papers
since in the Revised Report,
(BLOCK X Y) = ((LAMBDA (A B) (B)) X (LAMBDA () Y)).
X is evaluated in the environment where the BEGIN
expression appears, and Y is evaluated in an environment where no
bindings have been added. In a dialect without first-class
environments and without internal definitions, the environment where Y
is evaluated is the same as that where X is evaluated, thus no
contours have been added.
I think that there are only two valid reasons for adding
special forms:
- They provide a convenient syntax for something which
could be expressed only considerably more clumsily without them.
Example: COND (as compared to nested IFs). These are usually macros
which expand into the clumsier form.
- They provide an extension to the language which requires
special handling by the interpreter or compiler. Their effect could
otherwise not be achieved. Example: QUOTE. These are the "TRUE"
special forms.
I don't think that CALL-WITH-CURRENT-CONTINUATION falls in
either class, so there is no need to make it a special form. Besides,
a portable program would not be able to rename it since we have not
specified a way for adding syntactic extensions.
Allowing IF not to have an alternative sub-expression is
convenient because it is clear and eliminates the need for WHEN,
another special form. I don't like the proliferation of special forms
and would object strongly to requiring another one for this purpose.
I can't see that there is a difference between WHEN and IF in terms of
compile-time checking. IF with two subexpressions and IF with three
subexpressions can be treated as different beasts.
I agree with the remaining three points.
- Bill Rozas (JINX@MIT-MC)
∂05-Dec-84 1353 RPG Re: Kent Dybvig's Comments on Scheme
∂04-Dec-84 2015 @MIT-MC:HUDAK@YALE.ARPA Re: Kent Dybvig's Comments on Scheme
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 4 Dec 84 20:15:10 PST
Received: by YALE-BULLDOG.YALE.ARPA; 4 Dec 84 11:49:28 EST (Tue)
Message-Id: <8412041649.AA04845@YALE-BULLDOG.YALE.ARPA>
Received: from YALE-RING by YALE-RES via CHAOS; Tue, 4 Dec 84 11:37:43 EST
Subject: Re: Kent Dybvig's Comments on Scheme
Date: Tue, 4 Dec 84 11:37:46 EST
From: Paul Hudak <Hudak@YALE.ARPA>
To: Bill Rozas <JINX%MIT-OZ@MIT-MC>
Cc: Scheme@MIT-MC, Hudak@YALE.ARPA
In-Reply-To: Bill Rozas <JINX%MIT-OZ@MIT-MC.ARPA>, 30 Nov 1984 23:35 EST (Fri)
Not having participated in the workshop, I've been hesistant about
making comments on the various issues flying by recently, but I couldn't
resist responding to Bill Roza's comments on special forms:
I think that there are only two valid reasons for adding
special forms:
- They provide a convenient syntax for something which
could be expressed only considerably more clumsily without them.
Example: COND (as compared to nested IFs). These are usually macros
which expand into the clumsier form.
- They provide an extension to the language which requires
special handling by the interpreter or compiler. Their effect could
otherwise not be achieved. Example: QUOTE. These are the "TRUE"
special forms.
I think there's one other reason, which in a sense subsumes the first:
*readability*. In particular, with regard to IF having or not having
an alternative sub-expression, I find that when I'm reading code and
come across an IF expression, I look very hard to see if it has an
alternative branch. If it doesn't, I look again to be sure I didn't
miss it. I think this "convenient feature" degrades readability.
I agree with Bill Rozas that one shouldn't proliferate special forms,
but my reason is that they are typically complex, and their syntax
becomes hard to remember. However you can't get much simpler than
WHEN and UNLESS. Their use, in my opinion, greatly enhances
readability.
- Paul Hudak (hudak@yale)
∂07-Dec-84 1341 RPG [willc: preliminary report of workshop]
∂06-Dec-84 1909 KMP@MIT-MC [willc: preliminary report of workshop]
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 6 Dec 84 19:09:23 PST
Date: 6 December 1984 22:02-EST
From: Kent M Pitman <KMP @ MIT-MC>
Subject: [willc: preliminary report of workshop]
To: willc%indiana.csnet @ CSNET-RELAY
cc: SCHEME @ MIT-MC
In-Reply-To: <8411120256.AA05811@iuvax.UUCP>
Ok, Will, I'm assuming things decided at the meeting are not cast in
concrete. I certainly take issue with a number of the decisions.
In a few places, I also pick on your notation a bit when I think
its unclear. Thanks for taking the time to write up everything.
Here come the comments...
* I dislike the description of "optional features". In particular,
you say:
Optional features may not be supported by every implementation,
but those that do support a feature will use the same syntax and
semantics for the feature. Hence code that makes use of optional
features will run on any implementation of Scheme that supplies
the optional features.
....
An implementation may extend the language in any way whatsoever,
but code that makes use of extended features is not portable.
These two paragraphs are in conflict and allow for lots of
inconsistency which I will identify at appropriate points later.
The principle troubles, though, are:
* Since a language can be extended in "any way whatsoever",
can such extensions be syntactically in conflict with optional
language features? eg, if #\ is optional and my dialect doesn't
use it, can I then define my own #\ as an extended part of the
required subset?
* In some cases, "required" language features can be redefined
incompatibly by "optional" features. eg,
(COND (FOO => X))
has a well-defined semantics regardless of whether the optional
"=>" feature of COND is present. However, that semantics is not
the same in the two cases.
The point I'm trying to make is that saying something is "optional" says
little if anything unless you at least define that if anyone uses the
syntax corresponding to the optionality in a dialect which doesn't support
the feature, that he is in error. Put another way, extensions may not
invalidate optional features. On the other hand, if this were stipulated
the list of "optional" features to which I objected would be quite lengthy.
* I disagree with calling the string quote character "double quote".
I prefer "doublequote". Since there is a character named "quote", the
phrase "double quote" might designate '' instead of ".
* In discussing what terminates tokens, you should say which of these
characters (presumably all) are also single-character tokens. In
particular, that "((A B)C)" is tokenized
{"(" "(A" " " "B" ")C" ")"}
or {"(" "(" "A" " " "B" ")" "C" ")"}
hangs in the balance. I'm sure no one disagrees, but if you're not going
to be complete about these things, you are sort of wasting your time
just trying to look formal about things.
* I agree with reserving {, [, ], and }, but I would specify that they
may be used as alphabetic according to syntactic escape conventions.
Optionally, the following characters may be delimiters that
terminate symbols:
* What does it mean to say single quote, backquote and sharpsign
may "optionally" terminate tokens? It means expressions
like (JOHN'S COAT) read differently in the different dialects. How
is this distinct from saying "Optionally, the following characters
may be delimiters that do not terminate symbols:"? It only makes
sense to say something is optional if it's going to mean that when
it's present. I bet some dialects treat (JOHN'S COAT) as a two-list
and others treat it as a three-list. Hence, any description of this
relation between dialects can at best say: "The specification takes
no stand on the issue of whether the following delimiters terminate
symbols. Any use of expressions like (JOHN'S COAT) are to be considered
non-portable."
* I would personally prefer if vertical bar had been defined to be
alphabetic, but I am at least happy that it is "not specified" what
its meaning is rather than that it is "optional".
* I strongly oppose the idea of not specifying an escape char for
symbols. You say there is "widespread agreement that ``slashification''
of characters within symbols is a relic that ought to be abandoned."
I am not party to such agreement.
I strongly oppose the idea of eliminating slashification. The Maclisp
conventions of vertical bars made up for the absence of strings. With
their passing, syntactic quoting of lots of chars is very rare, and
in those few cases, I think slashing works fine. It also is a low-cost
mechanism for the printer, since no lookahead is required. Also, it is
uniform with respect to strings, which already use slash for special
chars anyway. Finally, without slash, there would be no way to get
vbars into symbol names, since vbars cannot adequately quote themselves.
On the other hand, slash can work fine in the absence of vbar. So if
one thing is to go as a readsyntax quoter, it should be vbar.
* Will-- Your meta-syntactic use of "..." in a description of what
the "." character does is very confusing.
* Does anyone mind if (. A) is the same as A? It has a certain elegance
to it if you think about it.
(CONS 3 (CONS 2 (CONS 1 0))) => (3 . (2 . (1 . 0))) => (3 2 1 . 0)
(CONS 2 (CONS 1 0)) => (2 . (1 . 0)) => ( 2 1 . 0)
(CONS 1 0) => (1 . 0) => ( 1 . 0)
0 => 0 => ( . 0)
Just a thought.
* I find #!true and #!false to be ugly and visually confusing
with the popular convention of "!" designating something
destructive. It would make more sense for #! to be saved for
something like #. in Common Lisp. I agree #<something>TRUE
is reasonable. I'd have preferred #: for this.
* What does it mean to say:
"Optionally, binary numbers may be written using the #b notation.
Optionally, octal numbers may be written using the #o notation.
Optionally, decimal numbers may be written using the #d notation.
Optionally, hexadecimal numbers may be written using the #x notation."
Presumably this means that dialects not wanting #B, #O, etc. can
use these to mean other things.
* What does it mean to say:
"Optionally, special characters may be written using the #\
notation. If this feature is supported, then the Common Lisp
names for special characters must be supported."
Presumably this means that if I don't want to be able to write special
characters, I can make #\ do something else. In fact, if I want to
use other names than those used by Common Lisp, I can just "not support"
this feature and then "make any extension whatsoever" to my dialect
such that #\ does something completely different, like understand
a different set of character names.
* Was anything decided about whether #!TRUE and #!FALSE would self-evaluate
or whether they required quoting?
* Was anything decided about whether numbers must self-evaluate or
whether a dialect may require quoting?
* Since "optionally, numbers may be written using decimal points and/or
exponents", does this mean that numbers with decimal points are integers
or floating? Does it mean that if I don't support the feature that
I can take the alternate position?
* I notice that the space of symbol names is highly constrained for
the "required subset". A property, however, that should be required
is that within any given dialect, every interned symbol (no matter what
characters it contains) must have a printed representation which is
read-invertable within that dialect. I suspect that all dialects
do this already anyway, but it should be a guaranteed property of the
language since programmers will tend to depend on such things and should
have a guaranteed semantics backing them up.
* What does it mean to make NIL optionally evaluate to the empty list
or optionally evaluate to false. What happens if I make it false and
then try to run my code in another dialect where it's the empty list
and where the two are not the same thing. The definition of optionality
says that code written in the optional subset will run correctly in
another dialect supporting optional features. It doesn't seem to me
like a good idea to optionally define a symbol as able to take on several
values and then be able to write meaningful code.
* It is specified that "the order of evaluation within an application
is not specified". I would prefer "combination" or "expression" to
"application" as a matter of terminology to avoid confusion with the
application that happens in the APPLY function, which doesn't involve
evaluation at all.
* I don't like the name LETREC; I preferred LABELS. Neither is very
suggestive of anything; the latter is at least a real word.
* Will-- I don't like the use of the term "mistake" throughout the report,
at least without defining it formally. In my dialect, it connotes
an unintentional error and it seems to me that if the user
intentionally did the offending thing, it would not be a mistake.
I would say "error" in its place, or define the term "mistake"
formally early on.
* I disagree with the various forms that claim it to be a "mistake"
to use certain return values, allowing some implementations to
signal an error. I don't agree that such errors can ever be
detected at the language level; I would like a formal description
of exactly when it is believed that such an error could be signalled.
The forms in question are: IF, COND, SET!, DEFINE, DEFINE!, CASE,
SET-CAR!, SET-CDR!, and VECTOR-SET!.
* The semantics of (COND (X => Y)) is messy due to optionality as
described earlier.
* Will-- I would name the ... sequences in definitions of things
like LET, COND, etc which use multiple sequences. I realize you
use them right to left, but that could be made more apparent.
Perhaps ..foo.. instead of ...
* I find the name SET! both ugly and redundant. The "!" convention
as originally created by the T people identifies a destructive
variant of an otherwise-non-side-effecting operation. So, for
example, APPEND and APPEND!, etc. Logically, there could be a
CHANGE-CAR and CHANGE-CAR!, one of which was
(LAMBDA (C V) (CONS V (CDR C)))
and the other which was
(LAMBDA (C V) (SET (CAR C) V) C)
In any case, I strongly think that the primitive for assigment
should be SET and not SET!. In fact, since no one likes assignment
anyway, I don't see any reason why anyone should object to just
leaving this undefined in the standard. It would only discourage
people from writing destructive code. But I would be very unhappy
to see T change the name of SET to SET!. Similarly, I strongly
dislike the name SET-CAR! and SET-CDR!.
* The definition of DEFINE refers to the "top-level" definition of
a variable. I don't believe it's established what "top-level" means,
so this definition is pretty muddy. Further, what is the implication
of this definition upon doing (LAMBDA (X) (DEFINE X X))?
I am very discouraged that the (DEFINE (fn . args) ...) syntax isn't
required. This means that any portable code must be ugly, meaning
no one is likely to ever write truly portable code, meaning this
standard is a farce.
* It is silly to require that there be at least one form in a (BEGIN).
It is easy for macros to come up with situations where there are
no forms to put there and as long as the macro's caller doesn't
depend on the value, it shouldn't matter. The return value of a
BEGIN with no forms should just be undefined.
* The fact that (LET* ...) cannot admit an optional name reveals an
asymmetry which I find very distasteful. I suggest that named LET
be left to implementors as an "arbitrary" extension not to be mentioned
in any common subset.
* I would prefer to have REC be called LABEL. Again, at least it's English.
* I don't see any good reason to have DO not bind RETURN. Can someone
elaborate on that?
* The description of DEFINE inside LAMBDA is inconsistent with the
earlier description of DEFINE as creating a toplevel definition.
I think this should be a non-standard extension. I see no reason to
dignify it with any "optional" status.
* The term "top-level binding" is again completely vague in DEFINE!'s
definition.
* The definition of optionality specified that if an optional feature
was present, the dialect should prefer to call it by the "optional"
name. This is somewhat inconsistent with making SEQUENCE an optional
synonym for BEGIN. Since it is not encouraged for use and is not going
to exist in all dialects, is there any sense to including it here?
* The entire section on datatypes is hopelessly muddled. About the only
useful thing said is that anything which is a first class object must
have unlimited extent.
* In the sentence "There is an object which represents both false and
the empty list", I cannot discern whether that means there may/must
be one/two objects filling that description. Shouldn't we say,
"False and the empty list must be represented as first class objects
and that object {may,must} [not] be distinct." or some such.
* Since datatypes are not declared to be disjoint, it isn't necessary
to mention that characters may be represented as numbers, except perhaps
as a footnote to remind the forgetful reader. Strings can be represented
as numbers, too, the way things are written.
* Was there really anyone who thought streams shouldn't be first class
objects? Since datatypes aren't disjoint and such objects could be
indistinguishable from numbers or arrays or whatever, is there really
a reason to care?
* The unary procedure not should be defined to return "a true value if
its argument is false and a false value if its argument is not false."
... rather than "if its argument is true." for the second part.
* I suggest renaming CALL-WITH-CURRENT-CONTINUATION (or CALL/CC) to just
CONTINUE. eg,
(CONTINUE (LAMBDA (C) (IF (FOO) (F C) (G C))))
Anyone else support this?
* By the way, saying the escape procedure has unlimited extent doesn't say
it can be called more than once. Does everyone agree to either stipulate
that or not?
* If "the unary predicate NUMBER? is true of numbers and false of
everything else" and "the unary predicate INTEGER? is true of
integers and false of everything else", I don't suppose this says
much since types are not disjoint and so strings are not necessarily
not numbers and need not necessarily cause INTEGER? or NUMBER? to
return false. Certainly characters needn't yield false from NUMBER?
or probably from INTEGER?. As such, these predicates are of limited
value.
* Of what point is it to make claims about what "almost all implementations"
will do for real numbers? Either they're required to or they aren't.
The rest belongs in some other document.
* I don't agree that allowing generalization of +, -, *, and / to arbitrary
arity is a good idea or even well-defined. eg, the proper generalization
of - to arity 1 is (- 3) => 3, not (- 3) => -3. Hence, specifying that unary
negation is optional is in conflict with specifying that - may be generalized.
* Will-- The discussion of QUOTIENT/REMAINDER and of CONS/CAR/CDR should
use the word "respectively" in the appropriate places. When I first read
that QUOTIENT and REMAINDER return the quotient and remainder, I spent
an unduly long time flipping back pages looking to see if you'd allowed
multiple values before I realized that it was silly for both these functions
to do the same thing or for that same thing to be what it had first looked
to me like they're doing.
* I don't see why MIN/MAX should be restricted from arity 0. They should
just return the smallest and largest representable numbers. I guess as
long as they aren't defined to signal an error in this case, individual
dialects could be extended anyway.
* It should be made explicit whether (= 1 1.0) is defined to work. Note that
this may be tricky since even (= 1.0 1.0) won't necessarily work if the
1.0's were computed rather than read and have different bit patterns that
are too tiny to make a difference on output.
* It is silly to specify that implementations may "optionally" support
numbers that are non-integers. Why not just define that (NUMBER? x)
doesn't imply (INTEGER? x). That definition wouldn't mean that
every number wasn't an integer, it would only mean that every number
wasn't necessarily a number.
Specifying that "almost all implementations" will support this option
is again silly and might in pathological situations be misleading.
* Is the definition of (TRUNCATE x) really correct? It looks like it must
be screwed up on the negative side near 0. eg, (TRUNCATE -0.5) doesn't
have the same sign as 0.5 does it? Or is there a negative 0?
* The meaning of "interning" a symbol should be specified.
* It should be stated explicitly that CAR and CDR of the empty list
is not defined.
* What's this nonsense about pairs being maybe indistinguishable from
vectors of length 2. Is there a good reason for that? It doesn't really
matter since numbers haven't been defined as distinguishable from
strings either, but it somehow offends my sense of aesthetics to see
this note here. Is this due to some problem with Maclisp HUNK2's or
something unrelated?
* In "The following descriptions use the notion of a proper list. The
set of proper lists is the smallest set satisfying:
the empty list is a proper list
a pair x whose CDR is a proper list is a proper list,
provided (MEMQ x (CDR x)) is false."
I think MEMQ isn't the function that you want, but I find it amusing
to see the language defined meta-circularly in this way (since MEMQ
is almost certainly defined to terminate only on proper lists and may
even want to type-check proper-list-ness).
* Is the function LENGTH defined to err or to not return when given
a circular list? What about an otherwise improper (ie, dotted) list?
* The definition of APPEND is poor. It should be defined with NAMED-LAMBDA
for safety in situations where APPEND gets redefined. Also, its text
description is too windy.
* I see no reason for APPEND! to be defined to possibly side-effect
either arg. This may force lots of needless copying in order to write
provably correct programs. I can't imagine a definition of APPEND!
which would want to destructively modify its last argument.
* All these definitions (APPEND, REVERSE, ...) are ugly due to the
silly restrictive version of DEFINE. I certainly wouldn't want my
students programming like that.
* It should be stated in English what happens if LIST-REF and LIST-TAIL
fall off the end. I assume it follows from the definitions of CAR/CDR
that such is a signallable error.
* There should be MEMQ?, MEMV?, and MEMBER? to match MEMQ, MEMV, and
MEMBER. This enhances garbage collection since if these functions
are only being used for truth value, you don't want to hold pointers
to potentially large list structures. Also, it enhances debugging since
if F is a function on booleans, (F (MEMQ X Y)) will receive true/false
rather than a list or false. Ditto for ASSQ?, ASSV?, and ASSOC?.
* You specify no order of evaluation for MAPCAR. I think you mean no
order of "application".
* I dislike the asymmetry between MAPCAR and MAPC.
MAPC has no defined return value, MAPCAR does.
MAPC has defined order of application, MAPCAR does not.
In short, they have nothing really in common other than they type
of their args. I think they should not be named so similarly.
* I oppose the names MAPCAR and MAPC.
T calls these MAP and WALK, respectively.
The generic form of MAPCAR, which is the only thing for which
arbitrary order of application would make sense (since lists are
only sequentially accessible anyway), has no business being called
MAPCAR.
* With respect to the questions about VECTOR->LIST, I think the
right thing to say is that the conses it returns are mutable,
not that the result is necessarily a "new object", since if the
result is the empty list (eg, from an empty vector), I wouldn't
want the implication to be that
(NOT (EQ (VECTOR->LIST #()) (VECTOR->LIST #())))
since it follows from that that more than one false value must
(rather than "may") be possible.
* What happens if VECTOR-REF is out of range?
* VECTOR-SET! is ugly. It should at least be called SET-VECTOR-REF!
for symmetry with the other SET- things. Personally, I hate the
! and would strongly prefer just SET-VECTOR-REF.
* The relation between OBJECT-HASH and the GC should be specified.
Do things get GC'd if no other pointers exist to them? Also, it
might help to distinguish this kind of "hash" from the number that
comes from SXHASH in Maclisp. It took me a second to realize you
weren't talking about that.
------- End undelivered message -------
------- End undelivered message -------
∂07-Dec-84 1342 RPG Slashification
∂07-Dec-84 1327 JAR@MIT-MC Slashification
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 7 Dec 84 13:27:00 PST
Date: 7 December 1984 16:26-EST
From: Jonathan A Rees <JAR @ MIT-MC>
Subject: Slashification
To: KMP @ MIT-MC
cc: SCHEME @ MIT-MC
In-reply-to: Msg of 6 Dec 1984 22:02-EST from Kent M Pitman <KMP>
I think I was the one who suggested that there be neither slashification
nor vertical barring. If you tell the T printer not to use either \ or
| and it comes across a symbol which requires quoting anyhow, it prints
#[Symbol "foo"]. (If you have \ but not |, as in T, you need to have a
way to print the null symbol anyhow; it really does come out as #[Symbol
""] - try it.) If print tables exist or if STRING->SYMBOL accepts any
string whatsoever, then it is necessary to have some way for unusual
symbols to read and print, but there's not really any need to wire down
a special character for the purpose, because e.g. some # syntax could be
used.
Some people wanted to throw away \ in favor of just | ; people didn't
agree when I suggested going with \ but not | ; so we just decided that
the need was unimportant enough that neither syntax was required. I'm
not convinced that there needs to be a defined, portable way to print
unusual symbols, although maybe you could talk me into such a position.
Jonathan
∂09-Dec-84 1252 RPG MIN/MAX, DO/RETURN
∂08-Dec-84 1556 KMP@MIT-MC MIN/MAX, DO/RETURN
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 8 Dec 84 15:56:47 PST
Date: 8 December 1984 18:55-EST
From: Kent M Pitman <KMP @ MIT-MC>
Subject: MIN/MAX, DO/RETURN
To: SCHEME @ MIT-MC
References: Msg of 6 Dec 1984 22:02-EST from Kent M Pitman <KMP@MIT-MC.ARPA>
Jonathan Rees had a good points about a couple of my earlier remarks...
* Making MIN/MAX return smallest/largest representable number would
be infeasible in implementions with bigfloats or bignums. I should
have thought of that. (I must have been thinking too hard about
the flonum case; sorry.) I'll withdraw that suggestion.
* Making DO bind RETURN would have been a bad idea since it would "wire"
the name RETURN. I withdraw my disparaging remarks about its not binding
that name.
-kmp
∂09-Dec-84 1304 RPG Re: critique of preliminary report
∂09-Dec-84 1142 @MIT-MC:willc%indiana.csnet@csnet-relay.arpa Re: critique of preliminary report
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 9 Dec 84 11:42:25 PST
Received: from indiana by csnet-relay.csnet id a022287; 9 Dec 84 14:34 EST
Received: by iuvax.UUCP (4.12/4.7)
id AA06716; Sat, 8 Dec 84 22:50:32 est
Date: Sat, 8 Dec 84 22:50:32 est
From: Will Clinger <willc%indiana.csnet@csnet-relay.arpa>
To: KMP@mit-mc.ARPA
Subject: Re: critique of preliminary report
Cc: scheme@mit-mc.ARPA
I am posting this in response to Kent Pitman's excellent critique
of the preliminary report on the October workshop.
ERRORS AND OVERSIGHTS
I should have said:
Numbers, strings, characters, and #!TRUE and #!FALSE are
self-evaluating, which means they need not be quoted. Symbols and
pairs are not self-evaluating. It is not specified whether other
things are self-evaluating.
The semantics of (BEGIN) is not specified.
TRUNCATE could be defined by
(DEFINE TRUNCATE
(LAMBDA (X)
(IF (NEGATIVE? X)
(- 0 (FLOOR (ABS X)))
(FLOOR X))))
VECTOR-REF takes a vector v and an nonnegative integer n such
that n < (VECTOR-LENGTH v) and returns ... .
----------------------------------------------------------------
WHAT DOES "OPTIONAL" MEAN?
The intent of the workshop was that extensions must not conflict
with optional features, and I should have said so. In fact some
features were made optional in order to prevent dialects from
using certain syntaxes for other purposes. It follows that #b,
#o, #d, #x, #\, and so on can't be used for any purpose other
than to support the optional feature.
I agree that the "optional" stuff about NIL and T is ridiculous.
What I should have said instead is that implementations that want
to treat NIL and T as constants can do so. In effect this is a
warning against using NIL and T to name variables. Note that
(FOO . NIL) cannot read the same as (FOO) no matter what.
I didn't mean to say that "optional" names or features are to be
preferred to essential names or features, and I don't think I did.
----------------------------------------------------------------
POOR WRITING IN THE PRELIMINARY REPORT
I agree with Kent Pitman's points regarding the use of "...",
"mistake", "respectively", and "double quote".
I agree that the set of single character tokens needs to be
better defined. I would like to include appendixes in the final
report with more rigorous descriptions of the lexical syntax, the
context-free syntax, and the denotational semantics of Scheme.
How about "The order of evaluation within a procedure call is not
specified"? That isn't quite true, of course -- normal order
evaluation is prohibited.
I should have said that some single object represents both false
and the empty list. There may be other objects representing
false. Might there be other objects representing the empty list?
(I hope not.)
I agree that it would be better to have said that (NUMBER? x)
doesn't imply (INTEGER? x).
I hope everyone noticed that it is an error to take the CAR or
CDR of the empty list.
The definition of a proper list in the preliminary manual was a
joke. The notion of a proper list has to do with finiteness,
which is not first order definable.
----------------------------------------------------------------
LEXICAL MATTERS
It is not specified whether single quote, backquote, sharp sign,
and vertical bar are delimiters that terminate symbols. The
status of these characters would be decided by a general rule
that emerged during discussion at the workshop, but not everyone
agreed to the general rule. The general rule is: Special
characters that come in pairs (left and right parenthesis, left
and right bracket, left and right curly brace, doublequote)
should be delimiters while other special characters (period and
so on) should be preceded by a space if they are to be used in
their special sense. Semicolon isn't really an exception to the
pattern because the end of line would be a delimiter anyway.
Whether vertical bar should be a delimiter according to the rule
depends on whether vertical bar is a special character that comes
in pairs, which is not specified. According to the rule single
quote, backquote, and sharp sign should not be delimiters.
> * I agree with reserving {, [, ], and }, but I would specify that they
> may be used as alphabetic according to syntactic escape conventions.
I don't understand "...according to syntactic escape conventions".
I have yet to hear of a good use for slashification of symbols.
If there is no compelling need for a feature, we should leave it
out. The workshop allowed slashification but did not encourage
it.
> * I notice that the space of symbol names is highly constrained for
> the "required subset". A property, however, that should be required
> is that within any given dialect, every interned symbol (no matter what
> characters it contains) must have a printed representation which is
> read-invertable within that dialect. I suspect that all dialects
> do this already anyway, but it should be a guaranteed property of the
> language since programmers will tend to depend on such things and should
> have a guaranteed semantics backing them up.
We probably ought to require something along these lines, but the
property you desire is too strong. It is enough that every
symbol read by the reader be printed in a form that will read
back in as the same symbol. Other symbols are too random to
worry about. In fact, I would be satisfied if only the required
set of symbol names have the property you desire. For what it's
worth, Franz Lisp does not have the property, but I doubt that
its lack is a significant cause of dissatisfaction with Franz
Lisp.
----------------------------------------------------------------
SPECIAL FORMS
LETREC vs LABELS, REC vs LABEL, and SET vs SET! are matters of
taste that needed to be decided one way or the other, and were.
History plays a significant role in decisions such as these. The
first two matters had history on both sides, but the history of
SET in traditional Lisp counts against it.
In some cases the IF, COND, and CASE special forms return
unspecified values. The SET!, DEFINE, and DEFINE! special forms,
and the SET-CAR!, SET-CDR!, and VECTOR-SET! procedures, always
return unspecified values. Except for the result of the DEFINE
form, I wrote that it is a "mistake" to use these undefined
values, and except for the results of the DEFINE and DEFINE!
forms I wrote that implementations could signal an error if the
values that were returned were "used". What I wrote is faithful
to the workshop's decisions, but I can be accused of deviating in
the cases of DEFINE and DEFINE! .
Kent is right to question what we meant by "using" a value. The
answer is that we don't know. Obviously an implementation could
return a strange value that would cause an error to be signalled
if an attempt were ever made to take its CAR or its successor; an
implementation could also arrange for an error to be signalled if
the value were ever an operand to EQ? or CONS; and there are no
doubt other things that an implementation could do as well. I
will not offer a formal description of the circumstances under
which an error could be signalled, because I want to leave room
for implementations to experiment with different approaches.
A DEFINE form is at "top level" iff it it not nested within any
other form. This definition clearly establishes circumstances
that do not count as top level, but it does not establish any
circumstances that do count as top level. I think each
implementation will have to specify circumstances under which a
DEFINE form has the described semantics, and those circumstances
would then count as top level for that implementation.
The Abelson and Sussman book uses DEFINE inside LAMBDA as
syntactic sugar. Scheme's future is tied to the success of that
book, so the sugar was dignified with "optional" status. The
optional status of the sugar required that ordinary DEFINE be
restricted to "top level".
The (DEFINE (fn . args) . body) sugar was dignified with
"optional" status for the same reason. Kent has acquired a taste
for this sugar, but I consider it a violation of orthogonality
that is doubly pernicious: (1) it discourages programmers from
thinking of procedures as objects distinct from their names; (2)
it discourages programmers from using procedures with local
state.
I agree that the optional status of named LET should imply an
optional status for named LET*, or else both named LET and named
LET* should be left out altogether. By the way, David Bartley
asks whether the body of a named let is within the scope of the
name. The Revised Report has the body within the scope of the
name. Does anyone want to argue that the body should be outside
the scope of the name?
When it was asked if DO should bind RETURN, someone said "Of
course not!", and that was the end of the discussion. If DO were
to bind RETURN I believe that DO would be the only construct in
even the optional language to bind an identifier that does not
appear explicitly in the code, and I see no reason to condone
that kind of anomaly.
----------------------------------------------------------------
DATATYPES
I was disappointed that we were unable to agree that any
datatypes were disjoint. The fact that the workshop participants
insisted on a special note to the effect that characters need not
be a distinguishable data type leads me to believe that despite
their votes most participants assumed that other data types were
distinguishable. In the final report I intend to recommend that
at the very least numbers, symbols, and pairs should be disjoint;
does anyone object?
Even without disjointness of data types, the NUMBER? and INTEGER?
predicates are useful because they define the domains of other
procedures. Thus if all strings are numbers then it must be
possible to subtract strings.
The nonsense to the effect that pairs might be indistinguishable
from vectors of length 2 is to remind folks not to assume that
pairs and vectors are disjoint. I agree that the note is out of
place and ugly.
I expect streams will be a Scheme data type, but as Kent points
out they might overlap with (say) numbers. Nonetheless the
STREAM? predicate will be useful because anything of which the
STREAM? predicate is true will have to support the operations on
streams.
----------------------------------------------------------------
PROCEDURES
I think it went without saying that escape procedures can be
called more than once. We didn't feel any need to say that the
addition procedure could be called more than once, either.
The generalizations of +, -, *, and / to arbitrary arity must
follow Common Lisp, so the ambiguity is not great. There is some
ambiguity, however, because Common Lisp has all sorts of rules
about integers, rationals, floats, and complexes that don't apply
to Scheme.
MIN and MAX are restricted from arity 0 because some
implementations don't have a smallest or largest representable
number.
Implementations should not be allowed to signal an error on
something like (=? 1 1.0). I don't think we should specify that
the result is true because you might want to implement a Scheme
in which "approximations" are a subtype of the numbers, in which
1.0 is read in as an approximation, and in which all equality
comparisons involving approximate numbers return false.
(If you haven't caught on by now, I'm perfectly comfortable with
the fact that Scheme is far less tightly specified than Common
Lisp. I believe Scheme can continue to represent the future of
Lisp only by being open to experimentation.)
I would rather not talk about "interning" a symbol, because
there is no need to talk about it when all symbols are interned.
I have to use words like "interned" to talk about implementations
in which not all symbols are interned, but for definitions I
would like to refer people to manuals for traditional Lisps.
LENGTH is defined on proper lists, and its action on anything
else is not specified.
I can't imagine a definition of APPEND! that would want to mutate
its last argument either. Shall we say it doesn't?
Kent's suggestions for MEMQ?, MEMV?, MEMBER?, ASSQ?, ASSV?, and
ASSOC? are interesting. Several people at the workshop, notably
Hal Abelson, expressed the belief that we ought to reconsider the
entire MEMBER/ASSOC complex of procedures.
MAPCAR and MAPC have their historical names for historical
reasons, and I would not be averse to renaming them eventually.
If MAPCAR is defined as follows in an implementation that
evaluates right to left, then the list of results is constructed
from right to left:
(DEFINE MAPCAR
(LAMBDA (F L)
(IF (NULL? L)
'()
(CONS (F (CAR L)) (MAPCAR F (CDR L))))))
I agree with Kent that the result of VECTOR->LIST cannot
reasonably be guaranteed to be a new object.
VECTOR-SET!, SET-CAR!, and SET-CDR! seem inconsistent to me too.
I think we should offer a prize for the best rationalization of
these names.
I think garbage collection should be treated as an issue of
performance rather than as an issue of semantics.
Peace, William Clinger
∂09-Dec-84 1307 RPG apology to Franz Lisp
∂09-Dec-84 1145 @MIT-MC:willc%indiana.csnet@csnet-relay.arpa apology to Franz Lisp
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 9 Dec 84 11:44:54 PST
Received: from indiana by csnet-relay.csnet id ab22287; 9 Dec 84 14:36 EST
Received: by iuvax.UUCP (4.12/4.7)
id AA01074; Sun, 9 Dec 84 11:11:13 est
Date: Sun, 9 Dec 84 11:11:13 est
From: Will Clinger <willc%indiana.csnet@csnet-relay.arpa>
To: scheme@mit-mc.ARPA
Subject: apology to Franz Lisp
I apologize to Franz Lisp for claiming that not every interned symbol in
Franz has a printed representation that reads back in as the same symbol.
An example confused me but not Franz Lisp.
Peace, William Clinger
∂10-Dec-84 1123 RPG Scheme conference report
∂09-Dec-84 1648 @MIT-MC:ANDY@SU-SCORE.ARPA Scheme conference report
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 9 Dec 84 16:48:24 PST
Date: Sun 9 Dec 84 16:45:14-PST
From: Andy Freeman <ANDY@SU-SCORE.ARPA>
Subject: Scheme conference report
To: scheme@MIT-MC.ARPA
Will the final report on the scheme conference say anything about
macros? There are substantial issues to be resolved/understood,
especially in scoping and expansion, but can a "least common
denominator" solution be released in the interim? How about a syntax
for defining top-level macros? That can be useful even if everything
else is undefined or optional.
Speaking of top-level solutions, why was ",." left out of the
backquote syntax? (It's like ",@", except that the value of the
following form may be destructively spliced into the result. ",@", as
you recall, non-destructively splices the value of the following form
into the result.) If deeper levels of nesting are going to be
optional, the report has to define what happens.
By the way, unless Scheme is going to define a character set, EOF
isn't necessarily a character. Neither is end-of-line.
Are all symbols interned or not? (Interning isn't necessary to print
code and be able to read it back in later.)
Still, most of these are minor issues. What the final report really
needs is a discussion of the decisions. Some things do not need
explanation (except to historians), like the names lambda and ', but,
for instance, why are true and false self-evaluating? In many cases
the decision itself is unimportant, but the issues are.
The report would also be improved by explicit mention of controversial
issues. So there's no decision; the reasons are still important.
-andy
ps - Why is argument evaluation order unspecified? That was a mistake
in Pascal, but then AND/OR/etc. have a defined order in Scheme. Then
again, so many standard forms have a right-left defined order that
consistency would suggest that user procedures should also.
Is the closure position an argument?
-------
∂10-Dec-84 1137 RPG Scheme conference report
∂09-Dec-84 1648 @MIT-MC:ANDY@SU-SCORE.ARPA Scheme conference report
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 9 Dec 84 16:48:24 PST
Date: Sun 9 Dec 84 16:45:14-PST
From: Andy Freeman <ANDY@SU-SCORE.ARPA>
Subject: Scheme conference report
To: scheme@MIT-MC.ARPA
Will the final report on the scheme conference say anything about
macros? There are substantial issues to be resolved/understood,
especially in scoping and expansion, but can a "least common
denominator" solution be released in the interim? How about a syntax
for defining top-level macros? That can be useful even if everything
else is undefined or optional.
Speaking of top-level solutions, why was ",." left out of the
backquote syntax? (It's like ",@", except that the value of the
following form may be destructively spliced into the result. ",@", as
you recall, non-destructively splices the value of the following form
into the result.) If deeper levels of nesting are going to be
optional, the report has to define what happens.
By the way, unless Scheme is going to define a character set, EOF
isn't necessarily a character. Neither is end-of-line.
Are all symbols interned or not? (Interning isn't necessary to print
code and be able to read it back in later.)
Still, most of these are minor issues. What the final report really
needs is a discussion of the decisions. Some things do not need
explanation (except to historians), like the names lambda and ', but,
for instance, why are true and false self-evaluating? In many cases
the decision itself is unimportant, but the issues are.
The report would also be improved by explicit mention of controversial
issues. So there's no decision; the reasons are still important.
-andy
ps - Why is argument evaluation order unspecified? That was a mistake
in Pascal, but then AND/OR/etc. have a defined order in Scheme. Then
again, so many standard forms have a right-left defined order that
consistency would suggest that user procedures should also.
Is the closure position an argument?
-------
∂10-Dec-84 1142 RPG Scheme conference report
∂10-Dec-84 0930 @MIT-MC:JINX@MIT-OZ Scheme conference report
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 10 Dec 84 09:30:02 PST
Date: 10 Dec 1984 12:24 EST (Mon)
Message-ID: <JINX.12070348515.BABYL@MIT-OZ>
From: Bill Rozas <JINX%MIT-OZ@MIT-MC.ARPA>
To: Andy Freeman <ANDY@SU-SCORE.ARPA>
Cc: scheme@MIT-MC.ARPA
Subject: Scheme conference report
I would like to clarify (hopefully) a few points with respect
to the evaluation of combinations:
The operator position is a "distinguished" argument. In my
opinion it is the only position for which the evaluation order might
be relevant. If reflective procedures are present in an
implementation, then the operator position must be evaluated before
any operand expressions. Since in the common subset there is no
provision for reflective procedures, there is no need to specify any
particular order with respect to the operator.
In Will's last message there was a statement to the effect
that normal order evaluation was prohibited. I find this rather
surprising. One of Guy Steele's Rabbit compiler main points was that
normal order beta-subtitution could be used very successfully to
optimize Scheme code. In the absence of side-effects, and if the
programs terminate when using applicative order, there is no way to
determine whether normal or applicative order is used. A compiler can
use this profitably to postpone evaluation. A think that a more
appropriate statement is that portable programs should not depend on
the evaluation strategy used, ie. they should work in applicative
order, but different implementations may want to use different
strategies.
The order of argument evaluation was left unspecified for
various reasons:
- The order of argument evaluation is only relevant in the
presence of side-effects in some of the argument expressions. Scheme
is predominantly an applicative language (contrast with Pascal), and
good style requires that arguments to procedures (operands of
combinations) be side-effect free. By not specifying the order, it
was felt that code with such side-effects would be discouraged since
its effect would not be predictable.
- An optimizing compiler can do a better job if it has freedom
over the order in which it can execute expressions. Leaving the order
unspecified potentially allows a clever compiler to choose the optimal
order for each combination. Note that the meaning of a program can
only be changed by reordering if the argument expressions contain
side-effects. But this code could not be guaranteed to work in other
implementations with different default order for argument evaluation.
- Leaving the order of argument evaluation unspecified allows
a parallel implementation to evaluate in parallel. Note again that
the only programs whose semantics are not clear (and are therefore not
predictable) are those with side-effects in the operands of a
combination.
A marginally related issue is the issue of macros. In Scheme
there is not so much of a need to provide a macro facility as in other
dialects of Lisp. Macros are needed only to provide "nicer", more
readable syntax. In other dialects they are needed to extend the
langauge since there is no way of encapsulating an expression and an
environment. Freezing ("thunkifying", wrapping in a lambda
expression) an argument encapsulates a context and an expression in a
procedure which can then be invoked at will by the operator procedure.
While syntactically clumsy, it is conceptually very elegant. Note
that an optimizing compiler (Rabbit, for example) can eliminate most
or all of the overhead of freezing and thawing.
∂12-Dec-84 1711 RPG Re: Scheme conference report
∂12-Dec-84 1601 @MIT-MC:dyb%unc.csnet@csnet-relay.arpa Re: Scheme conference report
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 12 Dec 84 16:01:03 PST
Received: from unc by csnet-relay.csnet id aa04450; 12 Dec 84 15:04 EST
Received: by unc (4.12/4.7) id AA29958; Tue, 11 Dec 84 00:04:48 est
Date: Tue, 11 Dec 84 00:04:48 est
From: Kent Dybvig <dyb%unc.csnet@csnet-relay.arpa>
Message-Id: <8412110504.AA29958@unc>
To: ANDY@su-score.ARPA, scheme@mit-mc.ARPA
Subject: Re: Scheme conference report
Relative order of evaluation of the expressions in a combination is
not specified for at least two reasons:
1) It is poor coding style to depend on order of evaluation. If
there is an ordering constraint it should be made explicit
by using "let" or some such.
2) It is a severe constraint on the implementation for the language
to specify an evaluation order. What is easy for one machine
model or architecture may be difficult for another.
Note that Pascal and Scheme are not alone; neither C nor Common Lisp
specify the order of evaluation.
Kent Dybvig
∂12-Dec-84 1847 RPG
∂12-Dec-84 1837 KMP@MIT-MC
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 12 Dec 84 18:37:11 PST
Date: 12 December 1984 21:37-EST
From: Kent M Pitman <KMP @ MIT-MC>
To: dyb%unc.csnet @ CSNET-RELAY
cc: SCHEME @ MIT-MC
Actually, Common Lisp does specify the order of evaluation as left to right.
It's tricky to find the passages in the CL manual which assure this, but
they are there if you look hard enough. The issue came up recently in the
Common Lisp discussion group and subsided after various obscure but recently
conclusive passages were cited.
Anyway, I agree it is reasonable to leave the order unspecified. It may be
computationally infeasible for the compiler to determine when order of
evaluation matters in many situations where "better code" would be available
if such a determination could be made.
∂15-Dec-84 2257 RPG Scheme bibliography
∂15-Dec-84 1655 @MIT-MC:CPH@MIT-OZ Scheme bibliography
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 15 Dec 84 16:55:16 PST
Date: Sat, 15 Dec 1984 19:53 EST
Message-ID: <CPH.12071740976.BABYL@MIT-OZ>
From: CPH%MIT-OZ@MIT-MC.ARPA
To: linus!ramsdell%UUCP@YALE.ARPA (John D. Ramsdell)
Cc: T-Discussion%MIT-OZ@MIT-MC.ARPA, Scheme@MIT-MC
Subject: Scheme bibliography
In-reply-to: Msg of 14 Dec 1984 13:44-EST
Fri 14 Dec 84 09:26:47 est from linus!ramsdell at Mitre-Bedford,
linus!ramsdell%UUCP at YALE.ARPA (John D. Ramsdell)
I believe that this is a complete list of the early Scheme papers:
Steele, Guy Lewis, Jr., and Gerald Jay Sussman. 1975. Scheme: An
interpreter for the extended lambda calculus. Memo 349, MIT
Artificial Intelligence Laboratory.
Steele, Guy Lewis, Jr., and Gerald Jay Sussman. 1976. Lambda: The
ultimate imperative. Memo 353, MIT Artificial Intelligence
Laboratory.
Steele, Guy Lewis, Jr. 1976. Lambda: The ultimate declarative. Memo
379, MIT Artificial Intelligence Laboratory.
Steele, Guy Lewis, Jr. 1977. Debunking the "expensive procedure
call" myth. In Proceedings of the National Conference of the ACM, pp.
153-62.
Steele, Guy Lewis, Jr., and Gerald Jay Sussman. January 1978. The
revised report on Scheme: A dialect of Lisp. Memo 452, MIT Artificial
Intelligence Laboratory.
Sussman, Gerald Jay, and Guy Lewis Steele, Jr. May 1978. The art of
the interpreter or, The modularity complex. Memo 452, MIT Artificial
Intelligence Laboratory.
Steele, Guy Lewis, Jr. May 1978. Rabbit: A compiler for Scheme.
Technical report 474, MIT Artificial Intelligence Laboratory.
∂16-Dec-84 2313 RPG continuation terminology
∂16-Dec-84 2039 @MIT-MC:cth%indiana.csnet@csnet-relay.arpa continuation terminology
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 16 Dec 84 20:39:11 PST
Received: from indiana by csnet-relay.csnet id a001364; 16 Dec 84 23:35 EST
Received: by iuvax.UUCP (4.12/4.7)
id AA04292; Sun, 16 Dec 84 17:27:40 est
Date: Sun, 16 Dec 84 17:27:40 est
From: Chris Haynes <cth%indiana.csnet@csnet-relay.arpa>
To: scheme@mit-mc.ARPA
Subject: continuation terminology
We object strongly to the use of the term "escape procedure". The word
"escape" is far to limiting to describe continuations, which are good for so
much more. Also, the term "escape procedure" is strongly associated with the
limited facility by that name provided by most Lisp systems (which isn't good
for much besides "escaping"). If we adopt the old Lisp name, many will
assume that continuations aren't good for anything more than Lisp escape
procedures, and will miss what we feel is one of the most important
distinguishing features of Scheme.
We find the term "continuation" to be quit satisfactory in most contexts, and
it agrees with the name "call-with-current-continuation". However, we
recognize that in the context of denotational semantics or implementation
discussions, there is possibility of confusing continuations as first-class
programming objects and their semantic or implementation counterparts. Thus
additional terminology is desirable when such distinctions must be made, and
to standardize such terminology it should probably be used in the Revised
Revised Report. We suggest the term "continuation object" for this purpose,
though it is not wonderful and we welcome other suggestions.
-- Chris Haynes
Dan Friedman
Eugene Kohlbecker
∂17-Dec-84 1524 RPG continuation terminology
∂17-Dec-84 1523 JAR@MIT-MC continuation terminology
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 17 Dec 84 15:23:41 PST
Date: 17 December 1984 18:23-EST
From: Jonathan A Rees <JAR @ MIT-MC>
Subject: continuation terminology
To: cth%indiana.csnet @ CSNET-RELAY
cc: SCHEME @ MIT-MC
In-reply-to: Msg of Sun 16 Dec 84 17:27:40 est from Chris Haynes <cth%indiana.csnet at csnet-relay.arpa>
Date: Sun, 16 Dec 84 17:27:40 est
From: Chris Haynes <cth%indiana.csnet at csnet-relay.arpa>
We object strongly to the use of the term "escape procedure"...
My only objection to the terms "continuation" and "continuation object"
is that the presence of fluids and/or UNWIND-PROTECT mean that the thing
created by the user-visible CATCH or CONTINUE or CALL-... or whatever is
more than just a continuation, since it implicitly includes a reference
to the dynamic state. There is a low-level thing which really does give
you a continuation, but I expect that the mechanism which does the right
thing with fluids would be the normal one used by users.
I don't have any new alternative term to propose, however.
Of course, we decided not to decide whether or not the essential
scheme's continuation procuration combinator dealt properly with fluids,
since essential scheme doesn't deal with fluids at all. It seems to me
that any fluid mechanism whatsoever must distinguish between
continuations and continuation+state things. As things are, my guess is
that it would be up to the implementor which of these two things the
essential dialect's CALL-... gives you. (This might be the wrong thing,
but I won't argue that position here - this message is already too long.)
(By the way, no Lisp has never had escape procedures per se; the usual
CATCH/THROW mechanism doesn't introduce a new kind of object. I'm not
convinced that the term "escape procedure" is as broken as you think it
is, or that it necessarily has such strong connotations. But I'm not
partial to the term.)
∂18-Dec-84 2009 @MIT-MC:dyb%unc.csnet@csnet-relay.arpa Common Lisp order of evaluation
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 18 Dec 84 20:09:07 PST
Received: from unc by csnet-relay.csnet id ac00691; 18 Dec 84 14:57 EST
Received: by unc (4.12/4.7) id AA12698; Fri, 14 Dec 84 10:29:23 est
Date: Fri, 14 Dec 84 10:29:23 est
From: Kent Dybvig <dyb%unc.csnet@csnet-relay.arpa>
Message-Id: <8412141529.AA12698@unc>
To: KMP@mit-mc.ARPA, SCHEME@mit-mc.ARPA
Subject: Common Lisp order of evaluation
If you have found passages which indicate a specific order of
evaluation, I'd like to know about it. All that I have found
in scouring the manual several times over, in several different
versions over the years, is that arguments are 'processed' in
left-right order *once they reach the caller*. Regardless of
any intentions on the designer's part, or any decisions of
discussion groups, any feature of the language that is not
spelled out is subject to interpretation. There is certainly
nowhere in the manual that states that arguments are evaluated
in left-right order.
∂19-Dec-84 0850 @MIT-MC:rhh@MIT-VAX Re: Common Lisp order of evaluation
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 19 Dec 84 08:49:55 PST
Received: by mit-vax.Mit-chaos.Arpa (4.12/4.8) id AA25785; Wed, 19 Dec 84 11:40:26 est
Date: Wed, 19 Dec 84 11:40:26 est
From: Bert Halstead <rhh@mit-vax>
To: KMP@mit-mc.ARPA, SCHEME@mit-mc.ARPA, dyb%unc.csnet@csnet-relay.arpa
Subject: Re: Common Lisp order of evaluation
The following passages out of the Common Lisp book may be relevant:
"setf carefully arranges to preserve the usual left-to-right order
in which the various subforms are evaluated." (p. 97)
"Macros that manipulate generalized variables must guarantee the
'obvious' semantics: subforms of generalized-variable references
are evaluated exactly as many times as they appear in the source
program, and they are evaluated in the same order as they appear
in the source program." (p. 99)
-Bert
∂19-Dec-84 0912 @MIT-MC:CPH@MIT-OZ Common Lisp order of evaluation
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 19 Dec 84 09:12:05 PST
Date: Wed, 19 Dec 1984 12:12 EST
Message-ID: <CPH.12072705547.BABYL@MIT-OZ>
From: CPH%MIT-OZ@MIT-MC.ARPA
To: Scheme@MIT-MC
Subject: Common Lisp order of evaluation
In-reply-to: Msg of 19 Dec 1984 11:40-EST from Bert Halstead <rhh at mit-vax>
Come on, folks... This really isn't a great place to argue about what
order Common Lisp evaluates its arguments. I for one don't care and
would rather not know.
∂23-Dec-84 0901 @MIT-MC:dyb%unc.csnet@csnet-relay.arpa length vs. list-length
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 23 Dec 84 09:00:57 PST
Received: from unc by csnet-relay.csnet id ag24249; 23 Dec 84 11:49 EST
Received: by unc (4.12/4.7) id AA29885; Sun, 23 Dec 84 01:06:25 est
Date: Sun, 23 Dec 84 01:06:25 est
From: Kent Dybvig <dyb%unc.csnet@csnet-relay.arpa>
Message-Id: <8412230606.AA29885@unc>
To: scheme@mit-mc.ARPA
Subject: length vs. list-length
My primary concern about the length function seems to have been
lost in the ensuing discussions. I do not insist that Scheme
have a generic length function. I do insist that the naming
convention for such functions be consistent.
Doesn't this look a little odd?
(define generic-length
(lambda (x)
(cond ((string? x) (string-length x))
((vector? x) (vector-length x))
((list? x) (length x)))))
I like this much better:
(define generic-length
(lambda (x)
(cond ((string? x) (string-length x))
((vector? x) (vector-length x))
((list? x) (list-length x)))))
We have list-ref, vector-ref and string-ref. Why should we not
have list-length rather than length? While we're at it, why
not have list-append rather than append? Any functions that makes
sense for strings, lists, and vectors should be named with
the appropriate type-name prefix. (Even though the language
doesn't specify a reverse function for vectors or strings
and since such functions would be reasonable, the reverse
function for lists should be named list-reverse.)
∂23-Dec-84 2330 @MIT-MC:mw%brandeis.csnet@csnet-relay.arpa list-length
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 23 Dec 84 23:30:17 PST
Received: from brandeis by csnet-relay.csnet id a026028; 24 Dec 84 2:15 EST
Received: by brandeis.ARPA (4.12/)
id AA06312; Sun, 23 Dec 84 22:31:44 est
Date: 23 Dec 1984 22:30-EST
From: mw%brandeis.csnet@csnet-relay.arpa
In-Real-Life: Mitchell Wand,faculty
Subject: list-length
To: scheme@mit-mc.ARPA
Cc: dyb%unc.csnet@csnet-relay.arpa
Message-Id: <472707016/mw@brandeis>
I vote for list-length over length, too. I think we just went over
that one a little too rapidly at the meeting.
-- Mitch Wand
∂24-Dec-84 1350 @MIT-MC:mw%brandeis.csnet@csnet-relay.arpa list-length
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 24 Dec 84 13:50:48 PST
Received: from brandeis by csnet-relay.csnet id a026028; 24 Dec 84 2:15 EST
Received: by brandeis.ARPA (4.12/)
id AA06312; Sun, 23 Dec 84 22:31:44 est
Date: 23 Dec 1984 22:30-EST
From: mw%brandeis.csnet@csnet-relay.arpa
In-Real-Life: Mitchell Wand,faculty
Subject: list-length
To: scheme@mit-mc.ARPA
Cc: dyb%unc.csnet@csnet-relay.arpa
Message-Id: <472707016/mw@brandeis>
I vote for list-length over length, too. I think we just went over
that one a little too rapidly at the meeting.
-- Mitch Wand
∂28-Dec-84 1428 JAR@MIT-MC list-length
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 28 Dec 84 14:28:26 PST
Date: 28 December 1984 17:29-EST
From: Jonathan A Rees <JAR @ MIT-MC>
Subject: list-length
To: SCHEME @ MIT-MC
In-reply-to: Msg of 23 Dec 1984 22:30-EST from mw%brandeis.csnet at csnet-relay.arpa
LIST-APPEND, LIST-REVERSE, LIST-SORT, LIST-MAPCAR, ...
∂13-Jan-85 1322 JAR@MIT-MC list-length
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 13 Jan 85 13:21:53 PST
Date: 13 January 1985 16:22-EST
From: Jonathan A Rees <JAR @ MIT-MC>
Subject: list-length
To: willc%indiana.csnet @ CSNET-RELAY
cc: SCHEME @ MIT-MC
In-reply-to: Msg of Fri 4 Jan 85 15:50:17 est from Will Clinger <willc%indiana.csnet at csnet-relay.arpa>
Date: Fri, 4 Jan 85 15:50:17 est
From: Will Clinger <willc%indiana.csnet at csnet-relay.arpa>
Pardon me, but I am not certain whether your recent message (LIST-APPEND,
LIST-REVERSE, LIST-SORT, LIST-MAPCAR, ...) was in support of or in derision
of the proposal that the operations on lists be prefixed by LIST- .
Neither, really. The intent of the message was to ask "where do you
draw the line?". E.g. APPEND / STRING-APPEND have the same problem that
LENGTH / STRING-LENGTH do. If you're not careful you end up putting
data type prefixes on everything (as you might have to in a strongly typed
language without unions, like PASCAL) or nothing (as in a fully generic
language like Smalltalk). What policy should we adopt?
∂13-Jan-85 1328 KMP@MIT-MC policy to adopt
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 13 Jan 85 13:28:24 PST
Date: 13 January 1985 16:28-EST
From: Kent M Pitman <KMP @ MIT-MC>
Subject: policy to adopt
To: SCHEME @ MIT-MC
i think we should clearly re-emphasize what our rationale is for wanting
any standard before deciding a policy. if the only purpose is to be able
to write papers that use common syntax, the policy might want to be different
than if we plan to port code. as much as we are not trying to make a
Common Scheme, some of the trends in this discussion have leaned pretty
strongly in that direction, perhaps overly so in some cases. i can elaborate
if it becomes appropriate to do so; i'll let it go at that for now.
-kmp
∂13-Jan-85 2117 @MIT-MC:CPH@MIT-OZ Scheme String Operations: the Report
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 13 Jan 85 21:17:18 PST
Date: Mon, 14 Jan 1985 00:17 EST
Message-ID: <CPH.12079391220.BABYL@MIT-OZ>
From: CPH%MIT-OZ@MIT-MC.ARPA
To: David Bartley <Bartley%ti-csl.csnet@CSNET-RELAY.ARPA>,
Jensen%ti-csl.csnet@CSNET-RELAY.ARPA,
Oxley%ti-csl.csnet@CSNET-RELAY.ARPA, Scheme@MIT-MC
Subject: Scheme String Operations: the Report
In-reply-to: Msg of 13 Jan 1985 15:14-EST from CPH
Here is the preliminary report... Comments and suggestions?
----------------------------------------------------------------------
Scheme String Operations
Notes:
[] Strings are mutable vectors of characters.
[] The string datatype described here has very little dependence on
the character datatype from which it is built. Thus it may correspond
to either of Common Lisp's STRING or SIMPLE-STRING datatypes. In MIT
Scheme, the character datatype is a special extended ASCII.
[] Historically, two different methods have been used to specify
substrings: [1] a starting index and a length, and [2] a starting
index and an ending index. Often both methods have been used for
different operations in the same data abstraction.
The arguments in favor of one method or the other seem fairly
uninteresting, but it does seem important to pick one method and use
it exclusively. The latter method has been chosen, for no
particularly good reason.
Naming conventions:
[] Character-wise operations are normally specified in two forms, one
which operates on a substring, and another which operates on an entire
string. The former is usually named "SUBSTRING-..." and the latter
"STRING-...".
[] Operations for which case sensitivity has meaning are specified in
two forms, a case-sensitive version and a case-insensitive version.
The latter is given the suffix "-CI" to distinguish it; if the
operation is a predicate, the suffix precedes the "?" at the end of
the name.
[] Currently, operation pairs which are "directed" will be given
similar names, such as, "STRING-FIND-NEXT-CHAR" and
"STRING-FIND-PREVIOUS-CHAR". However, there are 3 different syllable
pairs used to distinguish these operations: "LEFT"/"RIGHT",
"PREVIOUS"/"NEXT", and "FORWARD"/"BACKWARD". In each case the
particular pair was chosen for its meaning, but it may be better to
name them more consistently.
!
----------------------------------------------------------------------
Definitions
[] The phrases "X is a string" and "the string X" mean that X is an
object which satisfies the STRING? predicate. Similarly, "X is a
character" means that X satisfies the CHAR? predicate.
[] The "length" of a string is the number of characters that it
contains. This number is a non-negative integer that is determined at
the time of the string's creation.
[] The phrase "X is a valid index of Y", where Y is a string, means
that X is a non-negative integer that is strictly less than the length
of Y. The index 0 refers to the first character, 1 to the second,
etc.
[] The phrase "<X,Y,Z> is a substring" means that:
1. X is a string.
2. Both Y and Z are non-negative integers.
3. Z is less than or equal to the length of X.
4. Y is less than or equal to Z.
This refers to that substring of X starting at Y (inclusive) and
ending at Z (exclusive). If Y is equal to Z, the substring is empty.
[] The directions "left" and "right" refer to numerically decreasing
and numerically increasing indices, respectively. Alternate names for
these directions are (respectively): "previous" and "next"; "backward"
and "forward".
!
----------------------------------------------------------------------
Basic Character Operations
These are the operations on characters that are required by the string
data abstraction. The names of these operations are irrelevant; only
the functionality is required.
(CHAR-EQUAL? CHAR1 CHAR2)
(CHAR-EQUAL-CI? CHAR1 CHAR2)
These predicates are true iff CHAR1 and CHAR2 are the same character.
The former operation is case sensitive, while the latter is not. Both
CHAR1 and CHAR2 must be characters.
(CHAR-LESS? CHAR1 CHAR2)
(CHAR-LESS-CI? CHAR1 CHAR2)
These predicates are true iff CHAR1 is strictly less than CHAR2 in the
character ordering. The former operation is case sensitive, while the
latter is not. Both CHAR1 and CHAR2 must be characters.
(CHAR-UPPER-CASE? CHAR)
(CHAR-LOWER-CASE? CHAR)
These predicates should be true only of upper and lower case
alphabetic characters, respectively. CHAR must be a character.
(CHAR-UPCASE CHAR)
(CHAR-DOWNCASE CHAR)
These operations convert alphabetic characters to upper and lower
case, respectively. If CHAR is not alphabetic, it is returned. CHAR
must be a character.
Character-sets
A character-set is an abstract object that represents a (mathematical)
set of characters. Character-set searches are most useful for parsing
strings. The required operations on character-sets are:
(CHAR-SET-MEMBER? CHAR-SET CHAR)
A predicate which is true iff the character CHAR is a member of the
character-set CHAR-SET.
!
----------------------------------------------------------------------
Basic String Operations
These are the most primitive of the string operations. All other
string operations can be constructed from these.
(STRING-ALLOCATE LENGTH)
This operation allocates a new string, of the given LENGTH, and
returns it. The contents of the string are unspecified. LENGTH must
be a non-negative integer.
(STRING? OBJECT)
This is a boolean operation which is true of all strings.
(STRING-LENGTH STRING)
This operation returns the length of STRING, which must be a string.
The value is a non-negative integer.
(STRING-REF STRING INDEX)
This operation returns the INDEX'th element of STRING, a character.
STRING must be a string, and INDEX must be a valid index of STRING.
(STRING-SET! STRING INDEX CHAR)
This operation stores the character CHAR as the INDEX'th element of
STRING. STRING must be a string, INDEX must be a valid index of
STRING, and CHAR must be a character.
!
----------------------------------------------------------------------
Standard Operations
These operations are useful enough to deserve being "standard" in most
implementations.
(MAKE-STRING LENGTH CHAR)
This allocates a new string of the given LENGTH, and initializes all
of its elements to CHAR. LENGTH must be a non-negative integer, and
CHAR must be a character.
(STRING-FILL! STRING CHAR)
This fills the string STRING with the character CHAR.
(STRING-NULL? STRING)
This predicate is true only of strings whose length is zero. STRING
must be a string.
(SUBSTRING STRING START END)
This operation returns a new string, which is the substring designated
by <STRING, START, END>; this must be a valid substring designator.
(STRING-APPEND STRING1 STRING2)
This operation returns a new string, which is the concatenation of
STRING1 followed by STRING2, both of which must be strings. This
operation may (optionally) be n-ary, for n greater than 0.
(STRING-COPY STRING)
This operation returns a new copy of STRING, which must be a string.
(STRING->LIST STRING)
(LIST->STRING CHARS)
These operations convert between strings and lists of characters.
STRING must be a string, and CHARS must be a list of characters. The
result of either operation is a newly-created object of the
appropriate form.
!
----------------------------------------------------------------------
Motion Primitives
These operations are useful because they can be used to construct many
other string operations, e.g. SUBSTRING and STRING-APPEND. If strings
are implemented in the usual way on conventional machines, these
operations are extremely simple to implement. It is also possible to
in-line code them in some cases.
(SUBSTRING-FILL! STRING START END CHAR)
This fills the substring <STRING, START, END> with the character CHAR.
(SUBSTRING-MOVE-RIGHT! STRING1 START1 END1 STRING2 START2)
(SUBSTRING-MOVE-LEFT! STRING1 START1 END1 STRING2 START2)
These operations destructively copy the substring <STRING1, START1,
END1> to the string STRING2 starting at the index START2. It must be
the case that <STRING2, START2, (+ START2 (- END1 START1))> is a
substring; this latter substring is destructively modified to contain
the contents of the former substring.
The operations differ only when the two substrings overlap, i.e. when
STRING1 and STRING2 are EQ? and the index sets of the substrings are
not disjoint. In this case, the operations are defined to copy the
elements of the first substring serially. SUBSTRING-MOVE-RIGHT!
copies the first substring from left to right, while
SUBSTRING-MOVE-LEFT! copies from right to left. Thus, for example,
the two operations can be used to shift groups of characters right or
left, respectively, within a given string.
!
----------------------------------------------------------------------
Comparison Primitives
Each group of comparisons contains these operations:
1. Compare two substrings, case sensitive.
2. Compare two strings, case sensitive.
3. Compare two substrings, case insensitive.
4. Compare two strings, case insensitive.
The groups are described as a unit since they are essentially very
similar. In all cases the arguments must be valid substrings or
strings.
(SUBSTRING-EQUAL? STRING1 START1 END1 STRING2 START2 END2)
(STRING-EQUAL? STRING1 STRING2)
(SUBSTRING-EQUAL-CI? STRING1 START1 END1 STRING2 START2 END2)
(STRING-EQUAL-CI? STRING1 STRING2)
These are boolean equality predicates, which are true only when the
two (sub)strings are the same length and contain the same characters.
(SUBSTRING-LESS? STRING1 START1 END1 STRING2 START2 END2)
(STRING-LESS? STRING1 STRING2)
(SUBSTRING-LESS-CI? STRING1 START1 END1 STRING2 START2 END2)
(STRING-LESS-CI? STRING1 STRING2)
These are boolean order predicates, which compare the (sub)strings
using the standard dictionary order; i.e. the leftmost unequal
characters determine the order, or if no such character exists, the
shorter (sub)string is less. The character ordering is determined by
the character operations CHAR-LESS? and CHAR-LESS-CI?.
(SUBSTRING-MATCH-FORWARD STRING1 START1 END1 STRING2 START2 END2)
(STRING-MATCH-FORWARD STRING1 STRING2)
(SUBSTRING-MATCH-FORWARD-CI STRING1 START1 END1 STRING2 START2 END2)
(STRING-MATCH-FORWARD-CI STRING1 STRING2)
These operations compare the (sub)strings from left to right,
returning the number of characters that were the same.
(SUBSTRING-MATCH-BACKWARD STRING1 START1 END1 STRING2 START2 END2)
(STRING-MATCH-BACKWARD STRING1 STRING2)
(SUBSTRING-MATCH-BACKWARD-CI STRING1 START1 END1 STRING2 START2 END2)
(STRING-MATCH-BACKWARD-CI STRING1 STRING2)
These operations compare the (sub)strings from right to left,
returning the number of characters that were the same.
!
----------------------------------------------------------------------
Character Search Primitives
Each group of searches contains these operations:
1. Search substring for character, case sensitive.
2. Search string for character, case sensitive.
3. Search substring for character, case insensitive.
4. Search string for character, case insensitive.
5. Search substring for character in set.
6. Search string for character in set.
In all cases the arguments must be valid substrings, strings,
characters, or character-sets, as appropriate.
(SUBSTRING-FIND-NEXT-CHAR STRING START END CHAR)
(STRING-FIND-NEXT-CHAR STRING CHAR)
(SUBSTRING-FIND-NEXT-CHAR-CI STRING START END CHAR)
(STRING-FIND-NEXT-CHAR-CI STRING CHAR)
(SUBSTRING-FIND-NEXT-CHAR-IN-SET STRING START END CHAR-SET)
(STRING-FIND-NEXT-CHAR-IN-SET STRING CHAR-SET)
These operations return the index of the leftmost occurrence of the
character CHAR (or character-set CHAR-SET) within the given
(sub)string, or #!FALSE if there is no occurrence.
(SUBSTRING-FIND-PREVIOUS-CHAR STRING START END CHAR)
(STRING-FIND-PREVIOUS-CHAR STRING CHAR)
(SUBSTRING-FIND-PREVIOUS-CHAR-CI STRING START END CHAR)
(STRING-FIND-PREVIOUS-CHAR-CI STRING CHAR)
(SUBSTRING-FIND-PREVIOUS-CHAR-IN-SET STRING START END CHAR-SET)
(STRING-FIND-PREVIOUS-CHAR-IN-SET STRING CHAR-SET)
These operations return the index of the rightmost occurrence of the
character CHAR (or character-set CHAR-SET) within the given
(sub)string, or #!FALSE if there is no occurrence.
!
----------------------------------------------------------------------
Case
Each of the following groups contains the following operations:
1. A predicate to determine a substring's case.
2. A predicate to determine a string's case.
3. A operation to destructively set a substring's case.
4. A operation to destructively set a string's case.
The meanings of the operations should be clear from their names.
(SUBSTRING-UPPER-CASE? STRING START END)
(STRING-UPPER-CASE? STRING)
(SUBSTRING-UPCASE! STRING START END)
(STRING-UPCASE! STRING)
(SUBSTRING-LOWER-CASE? STRING START END)
(STRING-LOWER-CASE? STRING)
(SUBSTRING-DOWNCASE! STRING START END)
(STRING-DOWNCASE! STRING)
(SUBSTRING-CAPITALIZED? STRING START END)
(STRING-CAPITALIZED? STRING)
(SUBSTRING-CAPITALIZE! STRING START END)
(STRING-CAPITALIZE! STRING)
The (sub)string in the ...CAPITALIZE... operations may not be null.
!
----------------------------------------------------------------------
Appendix: An Implementation
The following is an examplary implementation of the above string
operations (in terms of the basic operations). This implementation is
intended to supplement the preceding specification by providing an
explicit formal description of the semantics of the operations.
There is no error checking in this code, since it was felt that it
would obscure the form somewhat.
!
;;;; Standard Operations
(define (make-string length char)
(let ((result (string-allocate length)))
(substring-fill! result 0 length char)
result))
(define (string-fill! string char)
(substring-fill! string 0 (string-length string) char))
(define (string-null? string)
(zero? (string-length string)))
(define (substring string start end)
(let ((result (string-allocate (- end start))))
(substring-move-right! string start end result 0)
result))
(define (string-append string1 string2)
(let ((length1 (string-length string1))
(length2 (string-length string2)))
(let ((result (string-allocate (+ length1 length2))))
(substring-move-right! string1 0 length1 result 0)
(substring-move-right! string2 0 length2 result length1)
result)))
(define (string-copy string)
(let ((length (string-length string)))
(let ((result (string-allocate length)))
(substring-move-right! string 0 length result 0)
result)))
(define (string->list string)
(let ((end (string-length string)))
(define (loop index)
(if (= index end)
'()
(cons (string-ref string index)
(loop (1+ index)))))
(loop 0)))
(define (list->string chars)
(let ((result (string-allocate (length chars))))
(define (loop index chars)
(if (null? chars)
result
(begin (string-set! result index (car chars))
(loop (1+ index) (cdr chars)))))
(loop 0 chars)))
!
;;;; Motion Primitives
(define (substring-fill! string start end char)
(define (loop index)
(if (not (= index end))
(begin (string-set! string index char)
(loop (1+ index)))))
(loop start))
(define (substring-move-right! string1 start1 end1 string2 start2)
(define (loop index1 index2)
(if (not (= index1 end1))
(begin (string-set! string2 index2 (string-ref string1 index1))
(loop (1+ index1) (1+ index2)))))
(loop start1 start2))
(define (substring-move-left! string1 start1 end1 string2 start2)
(define (loop index1 index2)
(if (not (= index1 start1))
(begin (string-set! string2
(-1+ index2)
(string-ref string1 (-1+ index1)))
(loop (-1+ index1) (-1+ index2)))))
(loop end1 end2))
!
;;;; Comparison Primitives
(define substring-equal?)
(define substring-equal-ci?)
(let ()
(define (make-substring-equal? char-equal?)
(lambda (string1 start1 end1 string2 start2 end2)
(define (loop index1 index2)
(or (= index1 end1)
(and (char-equal? (string-ref string1 index1)
(string-ref string2 index2))
(loop (1+ index1) (1+ index2)))))
(and (= (- end1 start1) (- end2 start2))
(loop start1 start2))))
(set! substring-equal?
(make-substring-equal? char-equal?))
(set! substring-equal-ci?
(make-substring-equal? char-equal-ci?)))
(define substring-less?)
(define substring-less-ci?)
(let ()
(define (make-substring-less? char-equal? char-less?)
(lambda (string1 start1 end1 string2 start2 end2)
(define (loop index1 index2)
(cond ((or (= index1 end1)
(= index2 end2))
(< (- end1 start1)
(- end2 start2)))
((char-equal? (string-ref string1 index1)
(string-ref string2 index2))
(loop (1+ index1) (1+ index2)))
(else
(char-less? (string-ref string1 index1)
(string-ref string2 index2)))))
(loop start1 start2)))
(set! substring-less?
(make-substring-less? char-equal? char-less?))
(set! substring-less-ci?
(make-substring-less? char-equal-ci? char-less-ci?)))
(define substring-match-forward)
(define substring-match-forward-ci)
(let ()
(define (make-substring-match-forward char-equal?)
(lambda (string1 start1 end1 string2 start2 end2)
(define (loop index1 index2 n)
(if (or (= index1 end1)
(= index2 end2)
(not (char-equal? (string-ref string1 index1)
(string-ref string2 index2))))
n
(loop (1+ index2) (1+ index2) (1+ n))))
(loop start1 start2 0)))
(set! substring-match-forward
(make-substring-match-forward char-equal?))
(set! substring-match-forward-ci
(make-substring-match-forward char-equal-ci?)))
!
(define substring-match-backward)
(define substring-match-backward-ci)
(let ()
(define (make-substring-match-backward char-equal?)
(lambda (string1 start1 end1 string2 start2 end2)
(define (loop index1 index2 n)
(if (or (= index1 start1)
(= index2 start2)
(not (char-equal? (string-ref string1 (-1+ index1))
(string-ref string2 (-1+ index2)))))
n
(loop (-1+ index2) (-1+ index2) (1+ n))))
(loop end1 end2 0)))
(set! substring-match-backward
(make-substring-match-backward char-equal?))
(set! substring-match-backward-ci
(make-substring-match-backward char-equal-ci?)))
(define string-equal?)
(define string-equal-ci?)
(define string-less?)
(define string-less-ci?)
(define string-match-forward)
(define string-match-forward-ci)
(define string-match-backward)
(define string-match-backward-ci)
(let ()
(define (string-comparison substring-comparison)
(lambda (string1 string2)
(substring-comparison string1 0 (string-length string1)
string2 0 (string-length string2))))
(set! string-equal?
(string-comparison substring-equal?))
(set! string-equal-ci?
(string-comparison substring-equal-ci?))
(set! string-less?
(string-comparison substring-less?))
(set! string-less-ci?
(string-comparison substring-less-ci?))
(set! string-match-forward
(string-comparison substring-match-forward))
(set! string-match-forward-ci
(string-comparison substring-match-forward-ci))
(set! string-match-backward
(string-comparison substring-match-backward))
(set! string-match-backward-ci
(string-comparison substring-match-backward-ci)))
!
;;;; Character Search Primitives
(define substring-find-next-char)
(define substring-find-next-char-ci)
(define substring-find-next-char-in-set)
(let ()
(define (find-next char-test)
(lambda (string start end key)
(define (loop index)
(and (not (= index end))
(if (char-test key (string-ref string index))
index
(loop (1+ index)))))
(loop start)))
(set! substring-find-next-char (find-next char-equal?))
(set! substring-find-next-char-ci (find-next char-equal-ci?))
(set! substring-find-next-char-in-set (find-next char-set-member?)))
(define substring-find-previous-char)
(define substring-find-previous-char-ci)
(define substring-find-previous-char-in-set)
(let ()
(define (find-previous char-test)
(lambda (string start end key)
(define (loop index)
(and (not (= index start))
(let ((index (-1+ index)))
(if (char-test key (string-ref string index))
index
(loop index)))))
(loop end)))
(set! substring-find-previous-char (find-previous char-equal?))
(set! substring-find-previous-char-ci (find-previous char-equal-ci?))
(set! substring-find-previous-char-in-set (find-previous char-set-member?)))
(define string-find-next-char)
(define string-find-next-char-ci)
(define string-find-next-char-in-set)
(define string-find-previous-char)
(define string-find-previous-char-ci)
(define string-find-previous-char-in-set)
(let ()
(define (string-search substring-search)
(lambda (string char)
(substring-search string 0 (string-length string) char)))
(set! string-find-next-char
(string-search substring-find-next-char))
(set! string-find-next-char-ci
(string-search substring-find-next-char-ci))
(set! string-find-next-char-in-set
(string-search substring-find-next-char-in-set))
(set! string-find-previous-char
(string-search substring-find-previous-char))
(set! string-find-previous-char-ci
(string-search substring-find-previous-char-ci))
(set! string-find-previous-char-in-set
(string-search substring-find-previous-char-in-set)))
!
;;;; Case
(define substring-upper-case?)
(define substring-lower-case?)
(let ()
(define (substring-has-case? char-has-case?)
(lambda (string start end)
(define (loop index)
(or (= index end)
(and (char-has-case? (string-ref string index))
(loop (1+ index)))))
(loop start)))
(set! substring-upper-case?
(substring-has-case? char-upper-case?))
(set! substring-lower-case?
(substring-has-case? char-lower-case?)))
(define substring-upcase!)
(define substring-downcase!)
(let ()
(define (substring-set-case! char-set-case)
(lambda (string start end)
(define (loop index)
(if (not (= index end))
(begin (string-set! string
index
(char-set-case (string-ref string index)))
(loop (1+ index)))))
(loop start)))
(set! substring-upcase!
(substring-set-case! char-upcase))
(set! substring-downcase!
(substring-set-case! char-downcase)))
(define (substring-capitalized? string start end)
(define (loop end)
(or (= end end)
(and (char-lower-case? (string-ref string end))
(loop (1+ end)))))
(and (not (= start end))
(char-upper-case? (string-8b-ref string 0))
(loop (1+ start))))
(define (substring-capitalize! string start end)
(substring-upcase! string start (1+ start))
(substring-downcase! string (1+ start) end))
(define string-upper-case?)
(define string-lower-case?)
(define string-upcase!)
(define string-downcase!)
(define string-capitalized?)
(define string-capitalize!)
(let ()
(define (string-op substring-op)
(lambda (string)
(substring-op string 0 (string-length string))))
(set! string-upper-case?
(string-op substring-upper-case?))
(set! string-lower-case?
(string-op substring-lower-case?))
(set! string-upcase!
(string-op substring-upcase!))
(set! string-downcase!
(string-op substring-downcase!))
(set! string-capitalized?
(string-op substring-capitalized?))
(set! string-capitalize!
(string-op substring-capitalize!)))
∂14-Jan-85 0217 @MIT-MC:dyb%unc.csnet@csnet-relay.arpa Re: policy to adopt
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 14 Jan 85 02:16:54 PST
Received: from unc by csnet-relay.csnet id a000234; 14 Jan 85 5:09 EST
Received: by unc (4.12/4.7) id AA00749; Sun, 13 Jan 85 22:21:36 est
Date: Sun, 13 Jan 85 22:21:36 est
From: Kent Dybvig <dyb%unc.csnet@csnet-relay.arpa>
Message-Id: <8501140321.AA00749@unc>
To: KMP@mit-mc.ARPA, SCHEME@mit-mc.ARPA
Subject: Re: policy to adopt
I have certainly gotten the impression that everyone WAS after
a Common Scheme of sorts, and not just so that papers use common
syntax. Why standardize on so many of the function names? More
importantly, why standardize on the controversial nil/t/boolean
issue? Why have numbers and streams subcommittees?
I think we need more discussion of the details of the proposal
rather than less. Standardizing is dangerous if done only half-
heartedly. We must be absolutely sure of what we standardize on,
since we will likely be stuck with our decisions for several
years.
As for the list-length, list-append issue, I think we ought to take
one of two stances:
Stance 1: sequence-type functions such as ref, length and
append should be prefixed by "list-" for the
list version, "string-" for the string version,
and "vector-" for the vector version.
Stance 2: sequence-type functions such as ref, length and
append should be prefixed by nothing for the
list version, "string-" for the string version,
and "vector-" for the vector version.
The first stance is preferable because of the symmetry, the second
because the names for the list versions (probably the most commonly
used) are shorter. I think that the first stance is less confusing
and less likely to cause trouble.
Cheers,
Kent Dybvig
∂15-Jan-85 1914 @MIT-MC:BARTLEY%ti-csl.csnet@csnet-relay.arpa Re: Scheme String Operations: the Report
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 15 Jan 85 19:14:04 PST
Received: from ti-csl by csnet-relay.csnet id ab07537; 15 Jan 85 22:01 EST
Date: 15 Jan 1985 1445-CST
From: David Bartley <Bartley%ti-csl.csnet@csnet-relay.arpa>
Subject: Re: Scheme String Operations: the Report
To: CPH%MIT-OZ@mit-mc.ARPA, Bartley%ti-csl.csnet@csnet-relay.arpa,
Jensen%ti-csl.csnet@csnet-relay.arpa, Oxley%ti-csl.csnet@csnet-relay.arpa,
Scheme@mit-mc.ARPA
cc: Bartley%ti-csl.csnet@csnet-relay.arpa
In-Reply-To: Your message of 14-Jan-85 0529-CST
Received: from csl60 by ti-csl; Tue, 15 Jan 85 20:45 CST
Chris,
Thanks for mailing out the preliminary report on your proposal for string
operations for Scheme. Here are my initial reactions.
-- You were clearly influenced by Common LISP (CL), since at least 27 of
your functions have direct analogues in CL. You rarely use the same name
as CL uses, however. I tend to agree with most of your name choices, with
one exception: CL uses the suffixes =, <, etc. for case sensitive
comparisons of characters and strings and the suffixes EQUAL, LESSP, etc
for case-insensitive comparisons. I prefer to adopt that convention rather
than inventing the suffix -CI.
We should definitely preserve our conventions regarding suffixed ? and !
characters. Thus, CL's CHAR-LESSP and NSTRING-UPCASE should be renamed
CHAR-LESS? and STRING-UPCASE! for Scheme.
-- We probably need a (CHAR? obj) predicate. The following are useful;
can we agree on their meaning?
(CHAR->INTEGER char) ; CL's CHAR-CODE ?
(INTEGER->CHAR n) ; CL's CODE-CHAR ?
-- The order of arguments to CHAR-SET-MEMBER? is reversed from that of
MEMBER, MEMV, and MEMQ.
-- How is a CHAR-SET represented? Created? Modified?
-- Why distinguish STRING-ALLOCATE from MAKE-STRING? By analogy with
MAKE-VECTOR, shouldn't MAKE-STRING take the second argument optionally?
-- May I assume that string and character values are printed (and read)
as defined by CL?
-- May I assume that all of the operations you defined are procedures,
not special forms?
-- May I assume that SUBSTRING and other names written without ! always
return a copy rather than sharing structure?
-- One way to reduce the number of operations is to combine the STRING-
and SUBSTRING- operations by accepting optional substring operands. For
example, specify only
(STRING-FILL! string char { start { end }} )
instead of
(STRING-FILL! string char) and
and (SUBSTRING-FILL! string start end char)
Note that the <char> argument has been repositioned.
It would be useful to let either <start> or <end> default. In the example
above, we could interpret
(STRING-FILL! string char start)
as filling from <start> to the end of the string.
-- What should happen when indexes cross (start>end) or go outside the
proper range?
-- Are you suggesting that the Basic Character Operations, Basic String
Operations, and Standard Operations be 'required' and the rest be
'optional'? If not, where would you draw the line? Do you use all of
these operations yourself?
-- Is the character datatype in MIT Scheme extended the same way CL has
gone? Do you support font and/or bit info? Have you added operations to
MIT Scheme beyond those you've proposed in your report?
-- In summary, the proposal looks sound and I have no arguments with the
functionality you are proposing. I'll pass on further comments after we've
had time to digest it more thoroughly.
Regards,
David Bartley
-------
∂15-Jan-85 2339 @MIT-MC:CPH@MIT-OZ Scheme String Operations: the Report
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 15 Jan 85 23:38:57 PST
Date: Wed, 16 Jan 1985 02:40 EST
Message-ID: <CPH.12079941473.BABYL@MIT-OZ>
From: CPH%MIT-OZ@MIT-MC.ARPA
To: David Bartley <Bartley%ti-csl.csnet@CSNET-RELAY.ARPA>
Cc: Jensen%ti-csl.csnet@CSNET-RELAY.ARPA,
Oxley%ti-csl.csnet@CSNET-RELAY.ARPA, Scheme@MIT-MC.ARPA
Subject: Scheme String Operations: the Report
In-reply-to: Msg of 15 Jan 1985 15:45-EST from David Bartley <Bartley%ti-csl.csnet at csnet-relay.arpa>
Date: Tuesday, 15 January 1985 15:45-EST
From: David Bartley <Bartley%ti-csl.csnet at csnet-relay.arpa>
-- We probably need a (CHAR? obj) predicate. The following are useful;
can we agree on their meaning?
(CHAR->INTEGER char) ; CL's CHAR-CODE ?
(INTEGER->CHAR n) ; CL's CODE-CHAR ?
-- How is a CHAR-SET represented? Created? Modified?
-- Is the character datatype in MIT Scheme extended the same way CL has
gone? Do you support font and/or bit info? Have you added operations to
MIT Scheme beyond those you've proposed in your report?
In answer to all of these: I purposefully limited my description of
characters (and character sets) to that minimum required to describe
the string abstraction. I did this because I felt that the string
abstraction was very largely independent of the character abstraction;
this method shows the dependence pretty clearly.
Further, I recall that we agreed to do characters "the CL way" at the
workshop, at least so far as syntax. My character datatype is heavily
influenced by CL, and implements bits but not fonts (I don't need them
now). I would be willing to discuss both this and the character set
abstraction if there is interest.
-- You were clearly influenced by Common LISP (CL)...
Not so! The string operations were mostly designed from scratch; I
was unaware that they were so similar.
...CL uses the suffixes =, <, etc. for case sensitive
comparisons of characters and strings and the suffixes EQUAL, LESSP, etc
for case-insensitive comparisons. I prefer to adopt that convention...
Mumble. I don't particularly like those names, but then, I don't
particularly care about the names anyway. I'm not terribly happy with
the names I chose, either. Whatever folks like is fine with me.
-- May I assume that string and character values are printed (and read)
as defined by CL?
Yes; I think that we agreed upon this at the workshop.
-- May I assume that all of the operations you defined are procedures,
not special forms?
Yes!!! (with feeling)
-- May I assume that SUBSTRING and other names written without ! always
return a copy rather than sharing structure?
Yes. Furthermore, operations written with a "!" return undefined values.
-- Why distinguish STRING-ALLOCATE from MAKE-STRING? By analogy with
MAKE-VECTOR, shouldn't MAKE-STRING take the second argument optionally?
-- One way to reduce the number of operations is to combine the STRING-
and SUBSTRING- operations by accepting optional substring operands.
Gee, I guess that I don't feel too good about that. While it is true
that this cuts down on the number of names, I have an irrational fear
of widespread optionalogy. Perhaps it is from overexposure to
flagrant misuse, but I prefer not to use optionals at the "lower
levels" of my code.
Also, I was particularly thinking about the workshop, and I recall
there was very mixed feeling about optional arguments, rest arguments,
etc... I felt that this would be more acceptable to the community as a
whole.
-- The order of arguments to CHAR-SET-MEMBER? is reversed from that of
MEMBER, MEMV, and MEMQ.
Hmm... This was chosen to match other stuff... in particular,
VECTOR-REF, LIST-REF, etc. The convention is: the compound structure
first, the key second. In hindsight, it clearly should match MEMfoo.
-- What should happen when indexes cross (start>end) or go outside the
proper range?
Sorry, I should have spelled that out. It will signal an error, in
all cases (although maybe not at the most reasonable place, in my
implementation). I specific: everywhere where I have described the
arguments as "a string", "a character", "a substring", etc., those can
be construed as requirements. Errors will happen if those things are
not true.
-- Are you suggesting that the Basic Character Operations, Basic String
Operations, and Standard Operations be 'required' and the rest be
'optional'? If not, where would you draw the line? Do you use all of
these operations yourself?
No, I was making no such suggestion; I would be hesitant to do so
without some discussion. I do feel that the 'Basic String Operations'
and 'Standard Operations', with the exception of STRING-ALLOCATE,
STRING-SET!, and STRING-FILL!, are very useful, and perhaps should be
required. But then we have already agreed upon those in the 'Basic'
category, and I think that there would be little disagreement about
the 'Standard' ones.
But I can't really say where one should draw the line; all of these
procedures are useful if one ever does anything even moderately hairy.
Personally, I have used most of these operations pretty freely in the
editor.
I have found, though, that a particular pattern has emerged. The
mutating operations are used (in conjunction with STRING-ALLOCATE)
almost exclusively for the construction of higher level, non-mutating
operations like SUBSTRING and STRING-APPEND. I think that such things
might safely be relegated to the realm of 'system programming', for
those who prefer to make such a distinction.
Anyway, if there is interest in working out the boundaries of
'required' vs. 'optional' here, I am perfectly willing to add more of
my flame to the conflagration.
∂24-Jan-85 1227 @MIT-MC:willc%indiana.csnet@csnet-relay.arpa Purpose of a "common" Scheme
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 24 Jan 85 12:00:43 PST
Received: from indiana by csnet-relay.csnet id aa09096; 24 Jan 85 14:38 EST
Date: Thu, 24 Jan 85 13:27:13 est
From: Will Clinger <willc%indiana.csnet@csnet-relay.arpa>
Received: by iuvax.UUCP; id AA11826; Thu, 24 Jan 85 13:27:13 est
To: scheme@mit-mc.ARPA
Subject: Purpose of a "common" Scheme
Surely different people have different purposes in wanting a more
standardized Scheme. Half the fun is getting everyone to work
together and to agree. It helps to be vague about the purposes.
Here's my view, from a draft introduction to the final report on
the workshop:
Scheme shares with Common Lisp [\ref] the goal of a core
language common to several implementations. Scheme differs
from Common Lisp in that the purpose of the common language
has more to do with porting ideas than with porting code.
It is appropriate therefore that Scheme is much smaller, is
less pervasively specified, and will evolve faster than
Common Lisp.
Let me know by direct mail if you object to that wording.
William Clinger (willc@indiana)
∂31-Jan-85 0921 @MIT-MC:mw%brandeis.csnet@csnet-relay.arpa Resources for Scheme course
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 31 Jan 85 09:17:38 PST
Received: from brandeis by csnet-relay.csnet id a002562; 31 Jan 85 12:13 EST
Received: by brandeis.ARPA (4.12/4.7)
id AA14346; Thu, 31 Jan 85 10:14:16 est
Date: 31 Jan 1985 09:53-EST
From: mw%brandeis.csnet@csnet-relay.arpa
In-Real-Life: Mitchell Wand,faculty
Subject: Resources for Scheme course
To: scheme@mit-mc.ARPA, cth%indiana.csnet@csnet-relay.arpa
Cc: jc%brandeis.csnet@csnet-relay.arpa
Message-Id: <476031183/mw@brandeis>
Brandeis is currently considering creating a laboratory facility for a
course on the style of the Abelson & Sussman book. For this purpose, I
am collecting data on the state of various workstation implementations
of Scheme and related languages (e.g. T). I know about some
implementations, but I'd like to hear about others, and get up-to-date
information even about the ones I've heard of. In particular:
1) What workstations would you recommend for such an undertaking, and
what is the availability of appropriate software?
2) What ratio of students/workstation would you recommend? To give
some idea of the planned intensity of the course, let's assume that we
plan to cover about half of the material in A&S in a semester.
3) How many square feet per workstation should we plan on? What other
things in the physical arrangment of the facility should we watch out
for?
Reply to mw@brandeis on csnet. I will post a summary of replies to
the mailing list if interest warrants. Thanks for your help.
-- Mitch Wand
∂01-Feb-85 0930 @MIT-MC:willc%indiana.csnet@csnet-relay.arpa quotient, remainder, letrec
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 1 Feb 85 09:30:03 PST
Received: from indiana by csnet-relay.csnet id aa04954; 1 Feb 85 12:23 EST
Date: Fri, 1 Feb 85 11:13:53 est
From: Will Clinger <willc%indiana.csnet@csnet-relay.arpa>
Received: by iuvax.UUCP; id AA02926; Fri, 1 Feb 85 11:13:53 est
To: scheme@mit-mc.ARPA
Subject: quotient, remainder, letrec
In the preliminary report on the Brandeis workshop, I said that quotient and
remainder were such that if x, y, q, and r are integers such that x = qy +
r, y is nonzero, r has the same sign as y and has absolute value less than
that of y, then (quotient x y) was q and (remainder x y) was r. Since that
conflicts with Common Lisp, Franz Lisp, T, Scheme 84, Scheme 312, and maybe
MIT Scheme (my documentation on MIT Scheme isn't clear on the issue), I
retract that definition in favor of:
If x, y, q, and r are integers such that x = qy + r, y is nonzero, r has the
same sign as x, and |r| < |y|, then (quotient x y) is q and
(remainder x y) is r.
The definition of letrec in the preliminary report is wrong.
William Clinger
willc@indiana
∂02-Feb-85 1040 @MIT-MC:BARTLEY%ti-csl.csnet@csnet-relay.arpa Scheme String Operations
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 2 Feb 85 10:39:59 PST
Received: from ti-csl by csnet-relay.csnet id ah03968; 2 Feb 85 10:36 EST
Date: 1 Feb 1985 1453-CST
From: David Bartley <Bartley%ti-csl.csnet@csnet-relay.arpa>
Subject: Scheme String Operations
To: Scheme@mit-mc.ARPA, CPH%MIT-OZ@mit-mc.ARPA
cc: SCHEME.Users%ti-csl.csnet@csnet-relay.arpa,
Bartley%ti-csl.csnet@csnet-relay.arpa
Received: from csl60 by ti-csl; Fri, 1 Feb 85 16:57 CST
We at Texas Instruments propose the following revisions to Chris Hanson's
preliminary report on Scheme string operations. Our implementation has
already begun, so we would like to hear objections and other comments as
soon as possible. Thanks!
1. Name changes
For reasonable consistency with Common LISP, we propose the following
name changes:
Preliminary report Our proposal Common LISP
================== ============ ===========
char-equal? char= char=
char-equal-ci? char-equal? char-equal
char-less? char< char<
char-less-ci? char-less? char-lessp
substring-equal? substring=
string-equal? string=
substring-equal-ci? substring-equal?
string-equal-ci? string-equal?
substring-less? substring<
string-less? string<
substring-less-ci? substring-less?
string-less-ci? string-less?
2. Added features
(CHAR? obj) is needed for argument checking, if nothing else.
We agreed at the workshop to include SYMBOL->STRING and STRING->SYMBOL
as essential and STRING->UNINTERNED-SYMBOL as optional.
3. Changes
CHAR-SET-MEMBER should take its arguments in the same order as MEMBER.
4. Essential features
We are not sure how to separate ``essential'' from ``optional''
features, but will probably go with everything labeled ``minimal'' below.
Those labeled ``elective'' will be available but not part of the initial
load of the system.
Basic Character Operations
minimal: CHAR?, CHAR=, CHAR-EQUAL?, CHAR<, CHAR-LESS?,
CHAR-UPCASE, CHAR-DOWNCASE
elective: CHAR-UPPER-CASE?, CHAR-LOWER-CASE?, CHAR-SET-MEMBER?
Basic String Operations
minimal: STRING?, STRING-LENGTH, STRING-REF, STRING-SET!,
STRING->SYMBOL, STRING->UNINTERNED-SYMBOL, SYMBOL->STRING
elective: STRING-ALLOCATE
Standard Operations
minimal: MAKE-STRING, STRING-FILL!, STRING-NULL?, SUBSTRING,
STRING-APPEND, STRING-COPY, STRING->LIST, LIST->STRING
Motion Primitives
minimal: SUBSTRING-FILL!, SUBSTRING-MOVE-RIGHT!, SUBSTRING-MOVE-LEFT!
Comparison Primitives
minimal: SUBSTRING=, STRING=, SUBSTRING-EQUAL?, STRING-EQUAL?,
SUBSTRING<, STRING<, SUBSTRING-LESS?, STRING-LESS?
elective: SUBSTRING-MATCH-FORWARD, STRING-MATCH-FORWARD,
SUBSTRING-MATCH-FORWARD-CI, STRING-MATCH-FORWARD-CI,
SUBSTRING-MATCH-BACKWARD, STRING-MATCH-BACKWARD,
SUBSTRING-MATCH-BACKWARD-CI, STRING-MATCH-BACKWARD-CI
Character Search Primitives
elective: all
Case
elective: all
-- Regards,
David Bartley
-------
∂03-Feb-85 1055 @MIT-MC:dyb%unc.csnet@csnet-relay.arpa Re: Scheme String Operations
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 3 Feb 85 10:55:37 PST
Received: from unc by csnet-relay.csnet id aa08903; 3 Feb 85 13:47 EST
Received: by unc (4.12/4.7) id AA06699; Sun, 3 Feb 85 11:59:27 est
Date: Sun, 3 Feb 85 11:59:27 est
From: Kent Dybvig <dyb%unc.csnet@csnet-relay.arpa>
Message-Id: <8502031659.AA06699@unc>
To: scheme@mit-mc.ARPA
Subject: Re: Scheme String Operations
1. I prefer Chris Hanson's names over the Common Lisp or Bartley names.
I find the CL names confusing and nonsensical. Why does "char=" imply
case sensitivity, and "char-equal" insensitivity? The issue is also
confused because CL does not define "equal" to use "char-equal" as is
implied by the naming convention.
I have been using Common Lisp for over a year since helping to bring
Data General's system into existence. Many of us get confused about
which is which. There would be no confusion with a suffix such as
"-ci". Except for possible commercial interests, I can see no benefit
in copying the mistakes of Common Lisp.
2. I would rather spell out the name "character" in all functions rather
than use the shorter "char" but that's probably too much to ask.
3. I wish there weren't so many string functions. I think Chris Hanson
was on the right track with the very primitive set that can be used to
construct all the others. Perhaps we should have that set plus
"string-ref", "string-equal?", "string-less?" "string-equal-ci?",
and "string-less-ci?", "string->list", and "list->string".
4. I have a "string" function in my system that comes in handy --
(string char1 char2 ... charN) => string of char1 ... charN, N >= 0.
For example:
(string #\a #\b #\c) => "abc".
This is easy to write as (lambda x (list->string x)).
It is particularly useful when creating a cursor-addressing string to
send to the tty.
5. I do not view "string->symbol" and "symbol->string" as string functions.
They are symbol creation and destructuring primitives. In fact, I
strongly object to the names since they are not analogous to other
type-conversion functions such as "string->list" and "character->integer".
I'll say more about this in a separate note.
6. As far as optionality is concerned, I would like to see Scheme adopt
an optional argument mechanism. It reduces the namespace and is really
quite easy to implement efficiently. A function like "member" becomes
completely general, allowing not only "eq?", "eqv?" and "equal?" tests
but also any user-defined test. However, until we have optional
arguments that can be defined using a lambda expression, I do not
think we should have primitives with optional arguments. Either the
language has them or it doesn't.
Kent Dybvig
decvax!mcnc!unc!dyb
dyb.unc@Csnet-Relay
∂03-Feb-85 1059 @MIT-MC:dyb%unc.csnet@csnet-relay.arpa string->symbol, symbol->string
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 3 Feb 85 10:57:50 PST
Received: from unc by csnet-relay.csnet id ab08903; 3 Feb 85 13:47 EST
Received: by unc (4.12/4.7) id AA06730; Sun, 3 Feb 85 12:00:27 est
Date: Sun, 3 Feb 85 12:00:27 est
From: Kent Dybvig <dyb%unc.csnet@csnet-relay.arpa>
Message-Id: <8502031700.AA06730@unc>
To: scheme@mit-mc.ARPA
Subject: string->symbol, symbol->string
"symbol->string" and "string->symbol functions are not analogous to
"list->string" or "string->list".
"list->string" converts the same data from one representation into
another, with no loss of information. The same is true for "string->list",
"list->vector", "vector->list" and "character->integer".
A symbol is more than just an alternate representation for a string.
It can be an identifier or keyword. Some implementations allow symbols
to have property lists. All of this information is lost in the conversion
from symbol to string, some is regained in the opposite conversion.
I tend to think of the name of a symbol as an attribute of that symbol.
"symbol->string" is more correctly called "symbol-name".
Furthermore, whereas one can expect "string->list" to return a new list
made up of unique cons-cells, "string->symbol" may return an existing
symbol. The name "intern" is more appropriate for this functionality.
(Perhaps "hash-make-symbol" would be a better name.)
If making a unique, uninterned symbol is supported, the name "make-symbol"
is appropriate.
If what we are really doing when we say "make-symbol" or "string->symbol"
was changing representations, then "string->symbol" would be fine. But
it isn't a type conversion we are doing, it's symbol creation (or lookup).
It seems analogous to saying "make-vector" should be called "integer->vector"
because we are taking an integer, n, and returning a vector of length n.
The name is only one aspect of a symbol in much the same way that the length
is only one aspect of a vector.
Kent Dybvig
decvax!mcnc!unc!dyb
dyb.unc@Csnet-Relay
∂04-Feb-85 0221 @MIT-MC:dyb%unc.csnet@csnet-relay.arpa Chez Scheme
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 4 Feb 85 02:21:10 PST
Received: from unc by csnet-relay.csnet id aa11924; 4 Feb 85 5:13 EST
Received: by unc (4.12/4.7) id AA00752; Mon, 4 Feb 85 00:29:49 est
Date: Mon, 4 Feb 85 00:29:49 est
From: Kent Dybvig <dyb%unc.csnet@csnet-relay.arpa>
Message-Id: <8502040529.AA00752@unc>
To: Scheme@mit-mc.ARPA
Subject: Chez Scheme
=> Chez Scheme
Chez Scheme is an implementation of Scheme for Vaxes
running 4.2 Bsd Unix. Chez Scheme supports all required and
most optional features of the anticipated Scheme standard.
The first Chez Scheme release will include an extensive
reference manual. A Chez Scheme tutorial is in preparation
for later releases.
Features of Scheme:
o Clean, concise dialect of Lisp
o Lexically scoped (as is Common Lisp)
o Full function closures (first-class, full funarg)
o Tail-recursion reliably translated into iteration
o Full upward/downward continuations
Features of Chez Scheme:
o Incremental native-code (Vax object code) compiler
o Flexible user interface
o Fast-loading compiled files
o Very fast arbitrary precision integer and rational
arithmetic
o Programmable exception handlers
o Support for multi-tasking (timer interrupts,
continuations)
o String and vector operations
o Macros and structures
o Engines (a process abstraction)
Application programs distributed with the first release
of Chez Scheme include a set operation package, a logic
programming subsystem, a lazy-cons facility, and a generic
matrix, vector and scalar multiplication package.
Faster than many Lisp systems, Chez Scheme may be the
fastest Scheme available. On the Vax 11/780, Chez Scheme is
competitive with benchmarks reported for Franz Lisp and
Digital Common Lisp at last summer's AAAI conference in
Austin, TX. For example, Chez Scheme runs the "Tak"
benchmark in 3.4 cpu seconds and the "Deriv" benchmark in
21.9 cpu seconds. The code tested contained no
declarations, used generic arithmetic, and had no inlined
calls. No separate compilation phase is necessary: all code
loaded into Chez Scheme is compiled incrementally.
Chez Scheme is available for mid-March distribution to
US educational institutions only. We will send a license
agreement to interested parties. There is a $400
distribution fee. We are not yet able to do foreign or
commercial distributions, but contact us if you are
interested.
Write for a copy of the license agreement and ordering
information to:
R. Kent Dybvig
Department of Computer Science
New West Hall (035-A)
University of North Carolina
Chapel Hill, NC 27514
USA
decvax!mcnc!unc!dyb (usenet)
dyb.unc@Csnet-Relay (ARPANET)
∂05-Feb-85 1002 JAR@MIT-MC string->symbol, symbol->string
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 5 Feb 85 10:02:15 PST
Date: 5 February 1985 13:03-EST
From: Jonathan A Rees <JAR @ MIT-MC>
Subject: string->symbol, symbol->string
To: dyb%unc.csnet @ CSNET-RELAY
cc: SCHEME @ MIT-MC
In-reply-to: Msg of Sun 3 Feb 85 12:00:27 est from Kent Dybvig <dyb%unc.csnet at csnet-relay.arpa>
Date: Sun, 3 Feb 85 12:00:27 est
From: Kent Dybvig <dyb%unc.csnet at csnet-relay.arpa>
"symbol->string" and "string->symbol functions are not analogous to
"list->string" or "string->list".
"list->string" converts the same data from one representation into
another, with no loss of information. The same is true for "string->list",
"list->vector", "vector->list" and "character->integer".
There definitely is loss of information in list->vector and friends; you
lose eq-ness and location identity. (vector->list (list->vector l))
gives a copy of the list, so rplacas and rplacds will affect different
cells.
A symbol is more than just an alternate representation for a string.
It can be an identifier or keyword. Some implementations allow symbols
to have property lists. All of this information is lost in the conversion
from symbol to string, some is regained in the opposite conversion.
I tend to think of the name of a symbol as an attribute of that symbol.
"symbol->string" is more correctly called "symbol-name".
How do symbols differ from numbers, which can also be identifiers or
"keywords," in this regard? The way I look at things, all associations
of information with symbols are through tables not intimately tied to
the symbol itself. To say that symbols "have" values is like saying
that integers "have" values - getting a value out of an environment,
given a symbol, is completely analogous to getting a value out of a
vector, given an integer. "Essential scheme" intentionally doesn't have
property lists. Symbol eq-ness is global, so property lists would
violate the desire (based on modularity considerations) to avoid global
state; and given reasonable association tables, they are unnecessary.
Furthermore, whereas one can expect "string->list" to return a new list
made up of unique cons-cells, "string->symbol" may return an existing
symbol. The name "intern" is more appropriate for this functionality.
(Perhaps "hash-make-symbol" would be a better name.)
But character->integer (or char->ascii or whatever you want to call it)
may return an existing integer. What is the difference? Since symbols
are immutable objects, like numbers, it's okay that an existing one is
returned.
Reference to hashing doesn't seem very abstract to me...
If making a unique, uninterned symbol is supported, the name "make-symbol"
is appropriate.
I personally don't believe in uninterned symbols (yet?), but if they
existed then make-symbol would be an okay name, maybe better than
string->uninterned-symbol, but I don't much care.
I think that the "->" convention shouldn't have such a terribly strict
meaning. I guess it's my feeling that string->symbol and symbol->string
are acceptable even if they aren't strictly analogous to list->vector
and vector->list.
Jonathan Rees (JAR@MC)
∂05-Feb-85 1007 JAR@MIT-MC Scheme String Operations
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 5 Feb 85 10:06:52 PST
Date: 5 February 1985 13:08-EST
From: Jonathan A Rees <JAR @ MIT-MC>
Subject: Scheme String Operations
To: SCHEME @ MIT-MC
In-reply-to: Msg of Sun 3 Feb 85 11:59:27 est from Kent Dybvig <dyb%unc.csnet at csnet-relay.arpa>
I should point out that there were people at the workshop who felt
strongly that strings should not be mutable. I don't much care one way
or the other. If the things which use the "..." syntax are immutable
then there ought to be such a thing as vector-of-character; also, if
strings are immutable then it's not clear how they are different from
symbols (maybe just that they self-evaluate). But those of you who
strongly think (or thought) that strings should be immutable should
speak up.
Jonathan
∂21-Feb-85 1259 @MIT-MC:willc%indiana.csnet@csnet-relay.arpa Draft delayed to 15 March
Received: from MIT-MC.ARPA by SU-AI.ARPA with TCP; 21 Feb 85 12:59:45 PST
Received: from indiana by csnet-relay.csnet id ad22218; 21 Feb 85 15:20 EST
Date: Thu, 21 Feb 85 13:14:25 est
From: Will Clinger <willc%indiana.csnet@csnet-relay.arpa>
Received: by iuvax.UUCP; id AA04423; Thu, 21 Feb 85 13:14:25 est
To: scheme@mit-mc.ARPA
Subject: Draft delayed to 15 March
The draft of the final report on the Brandeis workshop, which was to have
come out around 1 March, will be delayed to the vicinity of 15 March.
I will be travelling until 4 March and probably won't be able to finish
the draft until I get back. I am sorry for the delay.
William Clinger
Indiana University